FOREWORD


In bygone centuries, our physical world appeared to be filled to the brim with mysteries. Divine powers
could provide for genuine miracles; water and sunlight could turn arid land into fertile pastures, but the 
same powers could lead to miseries and disasters. The force of life, the vis vitalis, was assumed to be the 
special agent responsible for all living things. The heavens, whatever they were for, contained stars and other 
heavenly bodies that were the exclusive domain of the Gods. 

Mathematics did exist, of course. Indeed, there was one aspect of our physical world that was recognised to 
be controlled by precise, mathematical logic: the geometric structure of space, elaborated to become a genuine 
form of art by the ancient Greeks. From my perspective, the Greeks were the first practitioners of ‘mathematical 
physics’, when they discovered that all geometric features of space could be reduced to a small number of 
axioms. Today, these would be called ‘fundamental laws of physics’. The fact that the flow of time could be 
addressed with similar exactitude, and that it could be handled geometrically together with space, was only 
recognised much later. And, yes, there were a few crazy people who were interested in the magic of numbers, 
but the real world around us seemed to contain so much more that was way beyond our capacities of analysis. 

Gradually, all this changed. The Moon and the planets appeared to follow geometrical laws. Galilei and 
Newton managed to identify their logical rules of motion, and by noting that the concept of mass could be 
applied to things in the sky just like apples and cannon balls on Earth, they made the sky a little bit more 
accessible to us. Electricity, magnetism, light and sound were also found to behave in complete accordance 
with mathematical equations. 

Yet all of this was just a beginning. The real changes came with the twentieth century. A completely new 
way of thinking, by emphasizing mathematical, logical analysis rather than empirical evidence, was pioneered 
by Albert Einstein. Applying advanced mathematical concepts, only known to a few pure mathematicians, to 
notions as mundane as space and time, was new to the physicists of his time. Einstein himself had a hard 
time struggling through the logic of connections and curvatures, notions that were totally new to him, but are 
only too familiar to students of mathematical physics today. Indeed, there is no better testimony of Einstein’s 
deep insights at that time, than the fact that we now teach these things regularly in our university classrooms. 

Special and general relativity are only small corners of the realm of modern physics that is presently being 
studied using advanced mathematical methods. We have notoriously complex subjects such as phase transitions in 
condensed matter physics, superconductivity, Bose-Einstein condensation, the quantum Hall effect, particularly 
the fractional quantum Hall effect, and numerous topics from elementary particle physics, ranging from fibre 
bundles and renormalization groups to supergravity, algebraic topology, superstring theory, Calabi-Yau spaces 
and what not, all of which require the utmost of our mental skills to comprehend them. 

The most bewildering observation that we make today is that it seems that our entire physical world 
appears to be controlled by mathematical equations, and these are not just sloppy and debatable models, but 
precisely documented properties of materials, of systems, and of phenomena in all echelons of our universe. 

Does this really apply to our entire world, or only to parts of it? Do features, notions, entities exist that are 
emphatically not mathematical? What about intuition, or dreams, and what about consciousness? What
about religion? Here, most of us would say, one should not even try to apply mathematical analysis, although 
even here, some brave social scientists are making attempts at coordinating rational approaches. 

No, there are clear and important differences between the physical world and the mathematical world.
Where the physical world stands out is the fact that it refers to ‘reality’, whatever ‘reality’ is. Mathematics is 
the world of pure logic and pure reasoning. In physics, it is the experimental evidence that ultimately decides 
whether a theory is acceptable or not. Also, the methodology in physics is different. 

A beautiful example is the serendipitous discovery of superconductivity. In 1911, the Dutch physicist Heike 
Kamerlingh Onnes was the first to achieve the liquefaction of helium, for which a temperature below 4.25 K 
had to be realized. Heike decided to measure the specific conductivity of mercury, a metal that is frozen solid 
at such low temperatures. But something appeared to go wrong during the measurements, since the voltmeter
did not show any voltage at all. All experienced physicists in the team assumed that they were dealing
with a malfunction. It would not have been the first time for a short circuit to occur in the electrical 
equipment, but, this time, in spite of several efforts, they failed to locate it. One of the assistants was 
responsible for keeping the temperature of the sample well within that of liquid helium, a dull job, requiring 
nothing else than continuously watching some dials. During one of the many tests, however, he dozed off. 
The temperature rose, and suddenly the measurements showed the normal values again. It then occurred to 
the investigators that the effect and its temperature dependence were completely reproducible. Below 4.19 
degrees Kelvin the conductivity of mercury appeared to be strictly infinite. Above that temperature, it is 
finite, and the transition is a very sudden one. Superconductivity was discovered (D. van Delft, “Heike
Kamerlingh Onnes”, Uitgeverij Bert Bakker, Amsterdam, 2005 (in Dutch)).

This is not the way mathematical discoveries are made. Theorems are not produced by assistants falling 
asleep, even if examples do exist of incidents involving some miraculous fortune. 

The hybrid science of mathematical physics is a very curious one. Some of the topics in this Encyclopedia 
are undoubtedly physical. High T_c superconductivity, breaking water waves, and magneto-hydrodynamics,
are definitely topics of physics where experimental data are considered more decisive than any high-brow 
theory. Cohomology theory, Donaldson-Witten theory, and AdS/CFT correspondence, however, are examples 
of purely mathematical exercises, even if these subjects, like all of the others in this compilation, are strongly 
inspired by, and related to, questions posed in physics. 

It is inevitable, in a compilation of a large number of short articles with many different authors, to see quite a 
bit of variation in style and level. In this Encyclopedia, theoretical physicists as well as mathematicians together 
made a huge effort to present in a concise and understandable manner their vision on numerous important 
issues in advanced mathematical physics. All include references for further reading. We hope and expect that 
these efforts will serve a good purpose. 


Gerard 't Hooft,
Spinoza Institute,
Utrecht University,
The Netherlands.


PREFACE 


Mathematical Physics as a distinct discipline is relatively new. The International Association of
Mathematical Physics was founded only in 1976. The interaction between physics and mathematics
has, of course, existed since ancient times, but the recent decades, perhaps partly because we are living 
through them, appear to have witnessed tremendous progress, yielding new results and insights at a dizzying 
pace, so much so that an encyclopedia seems now needed to collate the gathered knowledge. 

Mathematical Physics brings together the two great disciplines of Mathematics and Physics to the benefit of 
both, the relationship between them being symbiotic. On the one hand, it uses mathematics as a tool to 
organize physical ideas of increasing precision and complexity, and on the other it draws on the questions 
that physicists pose as a source of inspiration to mathematicians. A classical example of this relationship 
exists in Einstein’s theory of relativity, where differential geometry played an essential role in the formulation 
of the physical theory while the problems raised by the ensuing physics have in turn boosted the development 
of differential geometry. It is indeed a happy coincidence that we are writing now a preface to an 
encyclopedia of mathematical physics in the centenary of Einstein’s annus mirabilis. 

The project of putting together an encyclopedia of mathematical physics looked, and still looks, to us a 
formidable enterprise. We would never have had the courage to undertake such a task if we did not believe, 
first, that it is worthwhile and of benefit to the community, and second, that we would get the much-needed 
support from our colleagues. And this support we did get, in the form of advice, encouragement, and 
practical help too, from members of our Editorial Advisory Board, from our authors, and from others as well, 
who have given unstintingly so much of their time to help us shape this Encyclopedia. 

Mathematical Physics being a relatively new subject, it is not yet clearly delineated and could mean 
different things to different people. In our choice of topics, we were guided in part by the programs of recent 
International Congresses on Mathematical Physics, but mainly by the advice from our Editorial Advisory 
Board and from our authors. The limitations of space and time, as well as our own limitations, necessitated 
the omission of certain topics, but we have tried to include all that we believe to be core subjects and to cover 
as much as possible the most active areas. 

Our subject being interdisciplinary, we think it appropriate that the Encyclopedia should have certain 
special features. Applications of the same mathematical theory, for instance, to different problems in physics 
will have different emphasis and treatment. By the same token, the same problem in physics can draw upon 
resources from different mathematical fields. This is why we divide the Encyclopedia into two broad sections: 
physics subjects and related mathematical subjects. Articles in either section are deliberately allowed a fair 
amount of overlap with one another and many articles will appear under more than one heading, but all are 
linked together by elaborate cross referencing. We think this gives a better picture of the subject as a whole 
and will serve better a community of researchers from widely scattered yet related fields. 

The Encyclopedia is intended primarily for experienced researchers but should be of use also to beginning 
graduate students. For the latter category of readers, we have included eight elementary introductory articles for easy 
reference, with those on mathematics aimed at physics graduates and those on physics aimed at mathematics 
graduates, so that these articles can serve as their first port of call to enable them to embark on any of the main 
articles without the need to consult other material beforehand. In fact, we think these articles may even form the 
foundation of advanced undergraduate courses, as we know that some authors have already made such use of them.

In addition to the printed version, an on-line version of the Encyclopedia is planned, which will allow both 
the contents and the articles themselves to be updated if and when the occasion arises. This is probably a 
necessary provision in such a rapidly advancing field. 

This project was some four years in the making. Our foremost thanks at its completion go to the members 
of our Editorial Advisory Board, who have advised, helped and encouraged us all along, and to all our 
authors who have so generously devoted so much of their time to writing these articles and given us much 
useful advice as well. We ourselves have learnt a lot from these colleagues, and made some wonderful 
contacts with some among them. Special thanks are due also to Arthur Greenspoon whose technical expertise 
was indispensable. 

The project was started with Academic Press, which was later taken over by Elsevier. We thank warmly 
members of their staff who have made this transition admirably seamless and gone on to assist us greatly in 
our task: both Carey Chapman and Anne Guillaume, who were in charge of the whole project and have been 
with us since the beginning, and Edward Taylor responsible for the copy-editing. And Martin Ruck, who 
manages to keep an overwhelming amount of details constantly at his fingertips, and who is never known to 
have lost a single email, deserves a very special mention. 

As a postscript, we would like to express our gratitude to the very large number of authors who generously 
agreed to donate their honorariums to support the Committee for Developing Countries of the European 
Mathematical Society in their work to help our less fortunate colleagues in the developing world. 


Jean-Pierre Françoise
Gregory L. Naber 
Tsou Sheung Tsun 


GUIDE TO USE OF THE ENCYCLOPEDIA 


Structure of the Encyclopedia 


The material in this Encyclopedia is organised into two sections. At the start of Volume 1 are eight Introductory Articles. 
The introductory articles on mathematics are aimed at physics graduates; those on physics are aimed at mathematics 
graduates. It is intended that these articles should serve as the first port of call for graduate students, to enable them to 
embark on any of the main entries without the need to consult other material beforehand. 

Following the Introductory Articles, the main body of the Encyclopedia is arranged as a series of entries in alphabetical 
order. These entries fill the remainder of Volume 1 and all of the subsequent volumes (2-5). 

To help you realize the full potential of the material in the Encyclopedia we have provided four features to help you find 
the topic of your choice: a contents list by subject, an alphabetical contents list, cross-references, and a full subject index. 


1. Contents List by Subject 


Your first point of reference will probably be the contents list by subject. This list appears at the front of each volume, 
and groups the entries under subject headings describing the broad themes of mathematical physics. This will enable the 
reader to make quick connections between entries and to locate the entry of interest. The contents list by subject is divided 
into two main sections: Physics Subjects and Related Mathematics Subjects. Under each main section heading, you will
find several subject areas (such as GENERAL RELATIVITY in Physics Subjects or NONCOMMUTATIVE GEOMETRY 
in Related Mathematics Subjects). Under each subject area is a list of those entries that cover aspects of that subject, 
together with the volume and page numbers on which these entries may be found. 

Because mathematical physics is so highly interconnected, individual entries may appear under more than one subject 
area. For example, the entry GAUGE THEORY: MATHEMATICAL APPLICATIONS is listed under the Physics Subject 
GAUGE THEORY as well as in a broad range of Related Mathematics Subjects. 


2. Alphabetical Contents List 


The alphabetical contents list, which also appears at the front of each volume, lists the entries in the order in which they 
appear in the Encyclopedia. This list provides both the volume number and the page number of the entry. 

You will find “dummy entries” where obvious synonyms exist for entries or where we have grouped together related
topics. Dummy entries appear in both the contents list and the body of the text. 


Example 
If you were attempting to locate material on path integral methods via the alphabetical contents list: 


PATH INTEGRAL METHODS see Functional Integration in Quantum Physics; Feynman Path Integrals 


The dummy entry directs you to two other entries in which path integral methods are covered. At the appropriate 
locations in the contents list, the volume and page numbers for these entries are given. 

If you were trying to locate the material by browsing through the text and you had looked up Path Integral Methods, 
then the following information would be provided in the dummy entry: 


Path Integral Methods see Functional Integration in Quantum Physics; Feynman Path Integrals 


3. Cross-References


All of the articles in the Encyclopedia have been extensively cross-referenced. The cross-references, which appear at the 
end of an entry, serve three different functions: 


i. To indicate if a topic is discussed in greater detail elsewhere.
ii. To draw the reader’s attention to parallel discussions in other entries.
iii. To indicate material that broadens the discussion.


Example 
The following list of cross-references appears at the end of the entry STOCHASTIC HYDRODYNAMICS 


See also: Cauchy Problem for Burgers-Type Equations; Hamiltonian 
Fluid Dynamics; Incompressible Euler Equations: Mathematical Theory; 
Malliavin Calculus; Non-Newtonian Fluids; Partial Differential Equations: 
Some Examples; Stochastic Differential Equations; Turbulence Theories; 
Viscous Incompressible Fluids: Mathematical Theory; Vortex Dynamics 


Here you will find examples of all three functions of the cross-reference list: a topic discussed in greater detail elsewhere 
(e.g. Incompressible Euler Equations: Mathematical Theory), parallel discussion in other entries (e.g. Stochastic
Differential Equations) and reference to entries that broaden the discussion (e.g. Turbulence Theories).

The eight Introductory Articles are not cross-referenced from any of the main entries, as it is expected that introductory 
articles will be of general interest. As mentioned above, the Introductory Articles may be found at the start of Volume 1. 


4. Index 


The index will provide you with the volume and page number where the material is located. The index entries 
differentiate between material that is a whole entry, is part of an entry, or is data presented in a figure or table. Detailed 
notes are provided on the opening page of the index. 


Introduction


This article gives a brief discussion of a topic with 
an enormous literature, namely the stability/instabil- 
ity of fluid flows. Following the seminal observa- 
tions and experiments of Reynolds in 1883, the issue 
of stability of a fluid flow became one of the central 
problems in fluid dynamics: stable flows are robust 
under inevitable disturbances in the environment, 
while unstable flows may break up, sometimes 
rapidly. These possibilities were demonstrated in a 
relatively simple experiment where flow in a pipe is 
examined at increasing speeds. As a dimensionless 
parameter (now known as the Reynolds number) 
increases, the flow completely changes its nature 
from a stable flow to a completely different regime 
that is irregular in space and time. Reynolds called 
this “turbulence” and observed that the transition 
from the simple flow to the chaotic flow was caused 
by the phenomenon of instability. 

Even though the topic has been the subject of 
intense study over more than a century, Reynolds’
experiment is still not fully explained by current
theory. Although there is no rigorous proof of 
stability of the simple flow (known as Poiseuille 
flow in a circular pipe), analytical and numerical 
investigations of the equations suggest theoretical 
stability for all Reynolds numbers. However, experi- 
ments show instability for sufficiently large 
Reynolds numbers. A plausible explanation for this 
phenomenon is the instability of such flows with 
respect to small but finite disturbances combined 
with their stability to infinitesimal disturbances. 
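
The distinction between infinitesimal and finite disturbances can be illustrated with a toy model. The sketch below is not the pipe-flow problem: it integrates the scalar equation du/dt = -u + u^2, chosen here purely as an assumed example in which the zero state is linearly stable while disturbances larger than 1 grow without bound. The function name `evolve`, the step size, and the blow-up threshold are all hypothetical choices made for illustration.

```python
def evolve(u0, dt=1e-3, t_end=10.0, blowup=1e6):
    """Integrate du/dt = -u + u*u from u(0) = u0 with classical RK4.

    Returns the value at t_end, or float('inf') once |u| exceeds `blowup`.
    This toy equation is an assumption for illustration only; it is not
    derived from the Navier-Stokes equations.
    """
    def rhs(u):
        return -u + u * u

    u = u0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        k1 = rhs(u)
        k2 = rhs(u + 0.5 * dt * k1)
        k3 = rhs(u + 0.5 * dt * k2)
        k4 = rhs(u + dt * k3)
        u += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        if abs(u) > blowup:
            # Treat runaway growth as instability to this disturbance.
            return float("inf")
    return u


if __name__ == "__main__":
    for amplitude in (1e-3, 0.5, 1.5):
        print(amplitude, evolve(amplitude))
```

Disturbances of amplitude below 1 decay towards zero, whereas an amplitude of 1.5 runs away in finite time, even though the linearization about zero (du/dt = -u) predicts decay for every initial value.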

The issue of fluid stability, in contexts much 
more complex than the fundamental experiment of 
Reynolds, arises in a multitude of branches of 
science, including engineering, physics, astrophysics,
oceanography, and meteorology. It is far
beyond the scope of this short article to even 
touch upon most of the extensive literature. In the 
bibliography we list just a few of the substantive 
books where classical results can be found 
(Chandrasekhar 1961, Drazin and Reid 1981, 
Gershuni and Zhukovitskii 1976, Joseph 1976,
Lin 1967, Swinney and Gollub 1985). Recent 
extensive bibliographies on mathematical aspects 
of fluid instability are given in several articles in the 
Handbook of Mathematical Fluid Dynamics
(Friedlander and Serre 2003) and the compendium
of articles on hydrodynamics and nonlinear
instabilities in Godrèche and Manneville (1998).


The Equations of Motion 


The Navier-Stokes equations for the motion of an 
incompressible, constant density, viscous fluid are 


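$$\frac{\partial q}{\partial t} + (q \cdot \nabla) q = -\frac{1}{\rho} \nabla P + \nu \nabla^2 q$$    [1a]


$$\operatorname{div} q = 0$$    [1b]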


where $q(x,t)$ denotes the velocity vector, $P(x,t)$ the
pressure, and the constants $\rho$ and $\nu$ are the density
and kinematic viscosity, respectively. This system is
considered in three (or sometimes two) spatial 
dimensions with a specified initial velocity field 


$$q(x, 0) = q_0(x)$$    [1c]


and physically appropriate boundary conditions: for 
example, zero velocity on a rigid boundary, or 
periodicity conditions for flow on a torus. This 
nonlinear system of partial differential equations 
(PDEs) has proved to be remarkably challenging, 
and in three dimensions the fundamental issues of 
existence and uniqueness of physically reasonable 
solutions are still open problems. 

It is often useful to consider the Navier-Stokes 
equations in nondimensional form by scaling the 
velocity and length by some intrinsic scale in the 
problem, for example, in Reynolds’ experiment by 
the mean speed U and the diameter of the pipe d. 
This leads to the nondimensional equations 


$$\frac{\partial q}{\partial t} + (q \cdot \nabla) q = -\nabla P + \frac{1}{R} \nabla^2 q$$    [2a]


$$\operatorname{div} q = 0$$    [2b]

where the Reynolds number R is

$$R = Ud/\nu$$    [3]
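
The scaling step can be sketched as follows. The primed variables and the choice of ρU² as the pressure scale are assumptions made here for illustration; the text itself specifies only that velocity and length are scaled by U and d.

```latex
% Assumed scalings: x = d x',  q = U q',  t = (d/U) t',  P = \rho U^2 P'.
% Each term of [1a] then carries the factors shown:
\begin{align*}
\frac{\partial q}{\partial t} &= \frac{U^2}{d}\,\frac{\partial q'}{\partial t'}, &
(q \cdot \nabla) q &= \frac{U^2}{d}\,(q' \cdot \nabla') q', \\
\frac{1}{\rho} \nabla P &= \frac{U^2}{d}\,\nabla' P', &
\nu \nabla^2 q &= \frac{\nu U}{d^2}\,\nabla'^2 q'.
\end{align*}
% Dividing [1a] by U^2/d leaves a single dimensionless group:
\[
\frac{\partial q'}{\partial t'} + (q' \cdot \nabla') q'
  = -\nabla' P' + \frac{\nu}{U d}\,\nabla'^2 q',
\qquad \frac{\nu}{U d} = \frac{1}{R}.
\]
```

Dropping the primes gives the nondimensional form [2a], with R as the only remaining parameter.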


In many situations, the size of R has a crucial 
influence on stability. Roughly speaking, when R is 
small the flow is very sluggish and likely to be 
stable. However, the effects of viscosity are actually 
very complicated and not only is viscosity able to 
smooth and stabilize fluid motions, sometimes it 
actually also destroys and destabilizes flows. 
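
As a small numerical illustration of definition [3], the sketch below evaluates R for a pipe flow. The function name `reynolds_number` and the sample values are assumptions chosen for illustration; the kinematic viscosity used is only an approximate figure for water near room temperature.

```python
def reynolds_number(mean_speed, diameter, kinematic_viscosity):
    """Return R = U d / nu, following equation [3].

    All three arguments must be given in consistent units,
    e.g. m/s, m and m^2/s, so that R is dimensionless.
    """
    if kinematic_viscosity <= 0.0:
        raise ValueError("kinematic viscosity must be positive")
    return mean_speed * diameter / kinematic_viscosity


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from the article):
    # water with nu of roughly 1.0e-6 m^2/s in a pipe 2 cm in diameter.
    nu_water = 1.0e-6
    for speed in (0.01, 0.1, 1.0):
        print(speed, reynolds_number(speed, 0.02, nu_water))
```

Because R scales linearly with the speed, a hundredfold increase in U raises R a hundredfold, which is the sense in which increasing the flow speed in Reynolds’ experiment sweeps through the parameter.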

The Euler equations, which predate the Navier- 
Stokes equations by many decades, neglect the 
effects of viscosity and are obtained from [1a] by
setting the viscosity parameter $\nu$ to zero. Since this


2 Stability of Flows 


removes the highest-derivative term from the equa- 
tions, the nature of the Euler equations is funda- 
mentally different from that of the Navier-Stokes 
equations and the limit of vanishing viscosity (or 
infinite Reynolds number) is a very singular limit. 
Since all real fluids are at least very weakly viscous, 
it could be argued that only the Navier-Stokes
equations are physically relevant. However, many 
important physical phenomena, such as turbulence, 
involve flows at very high Reynolds numbers (10* or 
higher). Hence, an understanding of turbulence is 
likely to involve the asymptotics of the Navier-Stokes
equations as $R \to \infty$. The first step towards
the construction of such asymptotics is the study of 
inviscid fluids governed by the Euler equations: 


$$\frac{\partial q}{\partial t} + (q \cdot \nabla) q = -\nabla P$$    [4a]


$$\operatorname{div} q = 0$$    [4b]


Stability issues for the Euler equations are in many 
respects distinct from those of the Navier-Stokes 
equations and in this article we will briefly touch 
upon stability results for both systems. 


Comments on Some “Classical” 
Instabilities 


To illustrate the complexity of the structure of 
instabilities that can arise in the Navier-Stokes 
equations, we mention one classical example, 
namely the centrifugal instabilities called Taylor- 
Couette instabilities. Consider a fluid between two 
concentric cylinders rotating with different angular 
velocities. If the inner cylinder rotates sufficiently 
faster than the outer one, the centrifugal force is 
stronger on inside particles than outside particles 
and a disturbance which exchanges the radial 
position of particles is enhanced, that is, the 
configuration is unstable. As the angular velocity 
of the inner cylinder is increased above a certain 
critical rate, the instability is manifested in a series 
of small toroidal (Taylor) vortices that fill the space 
between the cylinders. There follows a hierarchy of 
successive instabilities: azimuthal traveling waves, 
twisting regimes, and quasiperiodic regimes until 
chaotic solutions appear. Such a sequence of 
bifurcations is a scenario for a transition to 
turbulence postulated by Ruelle-Takens. Details 
concerning bifurcation theory and fluid behavior 
can be found in the book of Chossat and Iooss 
(1994). 

We note that phenomena of successive bifurca- 
tions connected with loss of stability, such as 
regimes of Taylor-Couette instabilities, occur at
moderately large Reynolds numbers. Fully developed
turbulence is a phenomenon associated with
very high Reynolds numbers. These are parameter 
regimes basically inaccessible in current numerical 
investigations of the Navier-Stokes equations and 
turbulent models. The Euler equations lie at the 
limit as $R \to \infty$. It is an interesting observation that
results at the limit of infinite Reynolds number are 
sometimes also applicable and consistent with 
experiments for flows with only moderate Reynolds 
number. 

There is a huge diversity of forces that couple 
with fluid motion to produce instability. We will 
merely mention a few of these which an interested 
reader could pursue in consultation with texts listed 
in the “Further reading” section and references
therein. 


1. The so-called Bénard problem of convective 
instability concerns a horizontal layer of fluid 
between parallel plates and subject to a tempera- 
ture gradient. The governing equations are the 
Navier-Stokes equation for a nonconstant den- 
sity fluid and the heat equation. In this problem, 
the critical parameter governing the onset of 
instability is called the Rayleigh number. The 
patterns that can develop as a result of instability 
are strongly influenced by the boundary condi- 
tions in the horizontal coordinates. With lattice 
type conditions, bifurcating solutions include 
rolls, rectangles, and hexagons. Convection rolls 
are themselves subject to secondary instabilities 
that may break the translation symmetry and 
deform the rolls into meandering shapes. Further 
refinements of convective instabilities include 
doubly diffusive convection, where the density 
depends on concentration as well as temperature. 
Competition between stabilizing diffusivity and 
destabilizing diffusivity can lead to the so-called 
“salt-finger” instabilities. 

2. Of considerable interest in astrophysics and 
plasma physics are the instabilities that occur in 
electrically conducting fluids. Here the fluid
equations are coupled with Maxwell’s equations. 
Much work has been done on the topic of 
magnetohydrodynamical (MHD) stability, which 
was developed to address various important 
physical issues such as thermonuclear fusion, 
stellar and planetary interiors, and dynamo 
theory. For example, dynamo theory addresses 
the issue of how a magnetic field can be 
generated and sustained by the motion of an 
electrically conducting fluid. In the simplest 
scenario, the fluid motion is assumed to be a 
given divergence-free vector field and the study of
the instabilities that may occur in the evolution
of the magnetic field is called the kinematic 
dynamo problem. This gives rise to interesting 
problems in dynamical systems and actually is 
closely analogous to the topic of vorticity 
generation in the three-dimensional (3D) fluid 
equations in the absence of MHD effects. 


In the next section we discuss certain mathema- 
tical results that have been rigorously proved for 
particular problems in the stability of fluid flows. 
We restrict our attention to the “basic” equations, 
that is, [2a] and [2b], [4a] and [4b], observing that 
even in rather simple configurations there are still 
more open problems than precise rigorous results. 


The Navier-Stokes Equations: 
Mathematical Definitions of 
Stability/Instability 


Instability occurs when there is some disturbance of 
the internal or external forces acting on the fluid 
and, loosely speaking, the question of stability or 
instability considers whether there exist disturbances 
that grow with time. There are many mathematical 
definitions of stability of a solution to a PDE. Most 
of these definitions are closely related but they may 
not be equivalent. Because of the distinctly different 
nature of the Navier-Stokes equations for a viscous 
fluid and the Euler equations for an inviscid fluid, 
we will adopt somewhat different precise definitions 
of stability for the two systems of PDEs. Both 
definitions are related to the concept known as 
Lyapunov stability. A steady state described by a 
velocity field $U_0(x)$ is called Lyapunov stable if
every state $q(x,t)$ “close” to $U_0(x)$ at $t = 0$ stays
close for all $t > 0$. In mathematical terms, “closeness”
is defined by considering metrics in a normed
space X. While in finite-dimensional systems the 
choice of norm is not significant because all Banach 
norms are equivalent, in infinite-dimensional sys- 
tems, such as a fluid configuration, this choice is 
crucial. The point was emphasized by Yudovich 
(1989) and it is a version of the definition of 
stability given in this book that we will adopt in 
connection with the parabolic Navier-Stokes 
equations. 


Definitions for a General Nonlinear 
Evolution Equation 
Consider an evolution equation for $u(x,t)$ whose
phase space is a Banach space X:


$$\frac{\partial u}{\partial t} = Lu + N(u, u)$$

We assume that if the initial value $u(x,0) \in X$ is
given, the future evolution $u(x,t)$, $t > 0$, of the
equation is uniquely defined (at least for sufficiently 
small initial data). Without loss of generality, we 
can assume that zero is a steady state. 

We define a version of Lyapunov (nonlinear) 
stability and its converse instability. 


Definition 1 Let (X, Z) be a pair of Banach spaces.
The zero steady state is called (X, Z) nonlinearly
stable if, no matter how small $\epsilon > 0$, there exists
$\delta > 0$ so that $u(x,0) \in X$ and


$$\|u(x, 0)\|_Z < \delta$$


imply the following two assertions:


(i) there exists a global in time solution such that
$u(x,t) \in C([0, \infty); X)$;
(ii) $\|u(x,t)\|_Z < \epsilon$ for a.e. $t \in [0, \infty)$.


The zero state is called nonlinearly unstable if either 
of the above assertions is violated. We note that 
under this strong definition of stability, loss of 
existence of a solution is a particular case of 
instability. The concept of existence that we will 
invoke in considering the Navier-Stokes equations is 
the existence of “mild” solutions introduced by Kato 
and Fujita (1962). Local-in-time existence of mild
solutions is known in $X = L^q$ for $q > n$, where n
denotes the space dimension. ($L^q$ denotes the usual
Lebesgue space.)

We now state two theorems for the Navier-Stokes 
equations [2a] and [2b]. The theorems are valid in any 
space dimension n and in finite or infinite domains. Of
course, the most physically relevant cases are n = 3 or
2. Both theorems relate properties of the spectrum of 
the linearized Navier-Stokes equations to stability or 
instability of the full nonlinear system. Let 
$U_0(x)$, $P_0(x)$ be a steady state flow:


$$(U_0 \cdot \nabla) U_0 = -\nabla P_0 + \frac{1}{R} \nabla^2 U_0 + F$$    [5a]


$$\nabla \cdot U_0 = 0$$    [5b]


where $U_0 \in C^\infty$ vanishes on the boundary of the
domain D and F is a suitable external force. We 
write [2a] and [2b] in perturbation form as 


$$q(x, t) = U_0(x) + u(x, t)$$    [6]

where

$$\frac{\partial u}{\partial t} = L_{NS} u + N(u, u)$$    [7a]

$$\nabla \cdot u = 0$$    [7b]
with

\[ L_{NS} u = -(U_0 \cdot \nabla) u - (u \cdot \nabla) U_0 + \frac{1}{R} \nabla^2 u - \nabla P_1 \tag*{[8]} \]

\[ N(u,u) = -(u \cdot \nabla) u - \nabla P_2 \tag*{[9]} \]


Here $P_1$ and $P_2$ are, respectively, the portions of the
pressure required to ensure that $L_{NS} u$ and $N(u,u)$
remain divergence free. The operators $L_{NS}$ and $N$ act
on the space of divergence-free vector-valued functions
in the closure of the Sobolev space $W^{s,p}$ that
vanish on the boundary of $D$.

We note that the spectrum of the elliptic linear
operator $L_{NS}$ with appropriate boundary conditions
in a bounded domain is purely discrete: that is, it
consists of a countable number of eigenvalues of
finite multiplicity with the sole limit point being at
infinity.


Theorem 2 (Nonlinear instability). Let $1 < p < \infty$
be arbitrary. Suppose that the operator $L_{NS}$ over $L^p$
has spectrum in the right half of the complex plane.
Then the flow $U_0(x)$ is $(L^q, L^p)$ nonlinearly unstable
for any $q > \max(p, n)$.


Theorem 3 (Asymptotic Lyapunov stability). Let
$q > n$ be arbitrary. Assume that the operator $L_{NS}$
over $L^q$ has spectrum confined to the left half of the
complex plane. Then the flow $U_0(x)$ is $(L^q, L^q)$
nonlinearly stable.


A recent proof of these theorems is given in
Friedlander et al. (2006) using a bootstrap type
argument. In Theorem 2, the space $L^q$, $q > n$, is used
as an auxiliary space in which the norm of the
nonlinear term is controlled, while the final instability
result is proved in $L^p$ for $p \in (1,\infty)$. We note
that this includes the most physically relevant case
of instability in the $L^2$ energy norm. An earlier proof
of the theorems under the restriction $p > n$ was
given by Yudovich (1989).

To apply Theorem 2 or 3 to conclude nonlinear
instability or stability of a given flow $U_0$, it is
necessary to have information concerning the spectrum
of the linear operator $L_{NS}$. Obtaining such
information has been the goal of much of the
literature concerning fluid stability (see the
bibliography and the references therein). However, except
in the case of some relatively simple flows, the
eigenvalues of $L_{NS}$ have not yet been calculated
explicitly. Perhaps the most tractable example is
that of plane parallel shear flows. Here the
eigenvalue problem is governed by an ordinary
differential equation (ODE) known as the
Orr-Sommerfeld equation, which has been the subject of
extensive analytical and numerical investigations.
Consider the parallel flow $U_0 = (U(z), 0, 0)$ in the
strip $-1 \le z \le 1$. For disturbances of the form


\[ \phi(z) \, e^{i(k_1 x + k_2 y)} \, e^{\lambda t} \tag*{[10]} \]


the eigenvalue $\lambda$ is determined by the following
equation with $k^2 = k_1^2 + k_2^2$:


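\[ (\lambda + i k_1 U) \left( \frac{d^2}{dz^2} - k^2 \right) \phi - i k_1 U'' \phi = \frac{1}{R} \left( \frac{d^2}{dz^2} - k^2 \right)^2 \phi \tag*{[11]} \]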


with boundary conditions $\phi = 0$ at $z = \pm 1$. We note
that the discreteness of the spectrum is preserved if
periodicity conditions are imposed in the $(x, y)$
plane.

The complexity of the spectral problem [11] is
apparent even for the simple case $U(z) = 1 - z^2$
(known as plane Poiseuille flow). Unstable eigenvalues
exist but only in certain regions of $(k, R)$
parameter space. There is a critical Reynolds number,
$R_c \approx 5772$, below which $\mathrm{Re}\,\lambda < 0$ for all wave
numbers $k$. For $R > R_c$, instability occurs in a band
of wave numbers and the thickness of this band
shrinks to zero as $R \to \infty$ (i.e., the inviscid limit).
Hence, Poiseuille flow with $R < R_c$ can be considered
as an example where the stability Theorem 3 can be
applied, that is, the flow is nonlinearly stable to
infinitesimal disturbances. However, extremely careful
experiments are needed to obtain agreement with
the theoretical value of $R_c = 5772$. Rather, it is more
usual in an experiment with $R \approx 2000$ that the flow
exhibits instability in the form of streamwise streaks
that appear near the walls. These structures do not
look like traveling waves of the form given by
expression [10]; rather, they are finite-amplitude
effects of nonmodal growth. Such linear growth of
disturbances, along with energy growth and
pseudospectra, has recently been investigated extensively.

An example where Theorem 2, proving nonlinear
instability, can be applied is the so-called
Kolmogorov flow. This is also a shear flow with the
spectral problem for the linearized operator given by
eqn [11]. In this example, the profile is oscillatory in $z$
with $U(z) = \sin mz$. In an elegant paper, Meshalkin
and Sinai (1961) used continued fractions to prove
the existence of a real positive (unstable) eigenvalue. It
is interesting, and in some sense surprising, that the
particular case of sinusoidal profiles leads to a
nonconstant-coefficient eigenvalue problem, where
it is possible to construct in explicit form the
transcendental characteristic equation that relates
the eigenvalues $\lambda$ and the wave numbers. Usually,
this can be done only for constant-coefficient equations.
In the case $U(z) = \sin mz$, a Fourier series
representation for the eigenfunctions leads to a
tridiagonal infinite matrix for the algebraic system
satisfied by the Fourier coefficients. This is amenable
to examination using continued fractions. Analysis of
the characteristic equation shows that there exist real
eigenvalues $\lambda > 0$ provided $R$ is larger than some
critical value for each wave number $k$ with $k^2 < m^2$.


The Euler Equation: Linear and 
Nonlinear Stability/Instability 


We conclude this brief article with some discussion
of instabilities in the inviscid Euler equations whose
existence is likely to be important as a “trigger” for
the development of instabilities in high-Reynolds-number
viscous flows. As we mentioned, the Euler
equations are very different from the Navier-Stokes
equations in their mathematical structure. The
Euler equations are degenerate and nonelliptic. As
such, the spectrum of the linearized operator $L_E$ is
not amenable to standard spectral theory of elliptic
operators. For example, unlike the Navier-Stokes
operator, the spectrum of $L_E$ is not purely discrete
even in bounded domains. To define $L_E$ we consider
a steady Euler flow $\{U_0(x), P_0(x)\}$, where


\[ (U_0 \cdot \nabla) U_0 = -\nabla P_0 \tag*{[12a]} \]

\[ \nabla \cdot U_0 = 0 \tag*{[12b]} \]


We assume that $U_0 \in C^\infty$. For the Euler equations,
appropriate boundary conditions include zero normal
component of $U_0$ on a rigid boundary, or
periodicity conditions (i.e., flow on a torus), or
suitable decay at infinity in an unbounded domain.
The theorems that we will be describing have been
proved mainly in the cases of the second and third
conditions stated above. There are many classes of
vector fields $U_0(x)$, in two and three dimensions,
that satisfy [12a] and [12b]. We write [4a] and [4b]
in perturbation form as


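\[ q(x,t) = U_0(x) + u(x,t) \tag*{[13]} \]

with

\[ \frac{\partial u}{\partial t} = L_E u + N(u,u) \tag*{[14a]} \]

\[ \nabla \cdot u = 0 \tag*{[14b]} \]

Here

\[ L_E u = -(U_0 \cdot \nabla) u - (u \cdot \nabla) U_0 - \nabla P_1 \tag*{[15]} \]

\[ N(u,u) = -(u \cdot \nabla) u - \nabla P_2 \tag*{[16]} \]
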
Linear (spectral) instability of a steady Euler flow
$U_0(x)$ concerns the structure of the spectrum of $L_E$.
Assuming $U_0 \in C^\infty(T^n)$, the linear equation

\[ \frac{\partial u}{\partial t} = L_E u, \qquad \nabla \cdot u = 0 \tag*{[17]} \]

defines a strongly continuous group in every Sobolev
space $W^{s,p}$ with generator $L_E$. We denote this group
by $\exp\{L_E t\}$. For the issue of spectral instability of
the Euler equation it proves useful to study not only
the spectrum of $L_E$ but also the spectrum of the
evolution operator $\exp\{L_E t\}$. This permits the
development of an explicit formula for the growth
rate of a small perturbation due to the essential (or
continuous) spectrum. It was proved by Vishik
(1996) that a quantity $\Lambda$, referred to as a “fluid
Lyapunov exponent,” gives the maximum growth
rate of the essential spectrum of $\exp\{L_E t\}$. This
quantity is obtained by computing the exponential
growth rate of a certain vector that satisfies a
specific system of ODEs over the trajectories of the
flow $U_0(x)$. This proves to be an effective mechanism
for detecting instabilities in the essential
spectrum that result from high-spatial-frequency
perturbations. For example, for this reason any flow
$U_0(x)$ with a hyperbolic fixed point is linearly
unstable with growth in the sense of the $L^2$-norm.
In two dimensions, $\Lambda$ is equal to the maximal
classical Lyapunov exponent (i.e., the exponential
growth of a tangent vector over the ODE $\dot{x} = U_0(x)$).
In three dimensions, the existence of a nonzero
classical Lyapunov exponent implies that $\Lambda > 0$.
However, in three dimensions there are also examples
where the classical Lyapunov exponent is zero
and yet $\Lambda > 0$. We note that the delicate issue of the
unstable essential spectrum is strongly dependent on
the function space for the perturbations and that $\Lambda$,
for a given $U_0$, will vary with this function space.
More details and examples of instabilities in the
essential spectrum can be found in references in the
bibliography.
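
The remark about hyperbolic fixed points can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the cellular flow with stream function psi = sin x sin y on the torus is an example chosen here (it is not taken from the text), and the function names are hypothetical. The code evaluates the velocity gradient at a stagnation point and reports the largest real part of its eigenvalues, which is the local exponential stretching rate of a tangent vector carried by the trajectory resting at that point.

```python
import math

def velocity_gradient(x, y):
    """Gradient of U0 = (sin x cos y, -cos x sin y), i.e. psi = sin x sin y (assumed example)."""
    return [
        [math.cos(x) * math.cos(y), -math.sin(x) * math.sin(y)],
        [math.sin(x) * math.sin(y), -math.cos(x) * math.cos(y)],
    ]

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix from its trace and determinant (complex values)."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = complex(trace * trace - 4.0 * det) ** 0.5
    return (trace + root) / 2.0, (trace - root) / 2.0

def stretching_rate(x, y):
    """Largest real part of the eigenvalues of the velocity gradient at (x, y)."""
    return max(lam.real for lam in eigenvalues_2x2(velocity_gradient(x, y)))

# Hyperbolic stagnation point at the cell corner: one expanding direction.
print(stretching_rate(0.0, 0.0))
# Elliptic stagnation point at the cell centre: purely rotational, no stretching.
print(stretching_rate(math.pi / 2, math.pi / 2))
```

A strictly positive value at the corner point signals a hyperbolic fixed point, while the value at the cell centre vanishes up to rounding.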

In contrast with instabilities in the essential
spectrum, the existence of discrete unstable eigenvalues
is independent of the norm in which growth is
measured. From this point of view, such instabilities
can be considered as “strong.” However, for most
flows $U_0(x)$ we do not know whether such
unstable eigenvalues exist. For fully 3D flows there are no
examples, to our knowledge, where such unstable
eigenvalues have been proved to exist for flows with
standard metrics. The case that has received the
most attention in the literature is the “relatively
simple” case of plane parallel shear flow. The
eigenvalue problem is governed by the Rayleigh
equation (which is the inviscid version of the
Orr-Sommerfeld equation [11]):


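\[ (\lambda + i k_1 U) \left( \frac{d^2}{dz^2} - k^2 \right) \phi - i k_1 U'' \phi = 0, \qquad \phi = 0 \ \text{at} \ z = \pm 1 \tag*{[18]} \]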


The celebrated Rayleigh stability criterion says that
a sufficient condition for the eigenvalues $\lambda$ to be
pure imaginary is the absence of an inflection point
in the shear profile $U(z)$. It is more difficult to prove
the converse; however, there have been several
recent results that show that oscillating profiles
indeed produce unstable eigenvalues. For example, if
$U(z) = \sin mz$ the continued fraction proof of
Meshalkin and Sinai can be adapted to exhibit the
full unstable spectrum for [18]. We note the “fluid
Lyapunov exponent” $\Lambda$ is zero for all shear flows;
thus the only way the unstable spectrum can be
nonempty for shear flows is via discrete unstable
eigenvalues.
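
The hypothesis of the Rayleigh criterion is easy to test for a given profile. The following is a minimal sketch, assuming the strip is $-1 < z < 1$ and using a finite-difference second derivative; the function names, grid size and tolerances are choices made here for illustration. It reports whether $U''$ changes sign in the interior. Absence of a sign change is the sufficient condition for spectral stability stated above; presence of one does not by itself prove instability.

```python
import math

def second_derivative(U, z, h=1e-4):
    """Centred finite-difference approximation of U''(z)."""
    return (U(z + h) - 2.0 * U(z) + U(z - h)) / (h * h)

def has_inflection_point(U, a=-1.0, b=1.0, samples=2001, tol=1e-6):
    """True if U'' changes sign at interior grid points of (a, b)."""
    previous_sign = None
    for j in range(1, samples - 1):          # interior points only
        z = a + (b - a) * j / (samples - 1)
        d2 = second_derivative(U, z)
        if abs(d2) < tol:                    # skip points where U'' is numerically zero
            continue
        sign = d2 > 0.0
        if previous_sign is not None and sign != previous_sign:
            return True
        previous_sign = sign
    return False

def poiseuille(z):
    return 1.0 - z * z                       # U'' = -2 everywhere

def oscillating(z):
    return math.sin(math.pi * z)             # sin(mz) with m = pi (assumed value)

print(has_inflection_point(poiseuille))
print(has_inflection_point(oscillating))
```

The parabolic profile has no inflection point, whereas the sinusoidal profile has one at $z = 0$, in line with the discussion of oscillating profiles.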

As we have discussed, it is possible to show that 
many classes of steady Euler flows are linearly 
unstable, either due to a nonempty unstable essential 
spectrum (i.e., cases where $\Lambda > 0$) or due to unstable
eigenvalues or possibly for both reasons. It is natural 
to ask what this means about the stability/instability 
of the full nonlinear Euler equations [14]-[16]. The 
issue of nonlinear stability is complex and there are 
several natural precise definitions of nonlinear 
stability and its converse instability. 

One definition is to consider nonlinear stability
in the energy norm $L^2$ and the enstrophy norm $H^1$,
which are natural function spaces to measure
growth of disturbances but are not “correct” spaces
for the Euler equations in terms of proven properties
of existence and uniqueness of solutions to the
nonlinear equation. Falling under this definition is
the most frequently employed method to prove
nonlinear stability, which is an elegant technique
developed by Arnol'd (cf. Arnol'd and Khesin
(1998) and references therein). This is based on
the existence of the so-called energy-Casimirs. The
vorticity $\mathrm{curl}\, q$ is transported by the motion of
the fluid so that at time $t$ it is obtained from the
vorticity at time $t = 0$ by a volume-preserving
diffeomorphism. In the terminology of Arnol'd,
the velocity fields obtained in this manner at any
two times are called isovortical. For a given field
$U_0(x)$, the class of isovortical fields is an
infinite-dimensional manifold $M$, which is the orbit of the
group of volume-preserving diffeomorphisms in the
space of divergence-free vector fields. The steady
flows are exactly the critical points of the energy
functional $E$ restricted to $M$. If a critical point is a
strict local maximum or minimum of $E$, then the
steady flow is nonlinearly stable in the space $J_1$ of
divergence-free vectors $u(x,t)$ (satisfying the boundary
conditions) that have finite norm,

\[ \|u\|_{J_1} = \|u\|_{L^2} + \|\mathrm{curl}\, u\|_{L^2} \tag*{[19]} \]

This theory can be applied, for example, to show
that any shear flow with no inflection points in the
profile $U(z)$ is nonlinearly stable in the function
space $J_1$, that is, the classical Rayleigh criterion
implies not only spectral stability but also nonlinear
stability.

We note that Arnol’d’s stability method cannot be 
applied to the Euler equations in three dimensions 
because the second variation of the energy defined 
on the tangent space to M is never definite at a 
critical point $U_0(x)$. This result suggests, but
does not prove, that most Euler flows in three
dimensions are nonlinearly unstable in the Arnol’d 
sense. To quote Arnol’d, in the context of the Euler 
equations “there appear to be an infinitely great 
number of unstable configurations.” 

In recent years, there have been a number of 
results concerning nonlinear instability for the 
Euler equation. Most of these results prove non- 
linear instability under certain assumptions on the 
structure of the spectrum of the linearized Euler 
operator. To date, none of the approaches prove 
the definitive result that in general linear instability 
implies nonlinear instability. As we have remarked, 
this is a much more delicate issue for Euler than for 
Navier-Stokes because of the existence, for a 
generic Euler flow, of a nonempty essential 
unstable spectrum. To give a flavor of the mathe- 
matical treatment of nonlinear instability for the 
Euler equations, we present one recent result and 
refer the interested reader to articles listed in the 
“Further reading” section for further results and 
discussions. 

In the context of Euler equations in two dimensions,
we adopt the following definition of Lyapunov
stability.


Definition 4 An equilibrium solution $U_0(x)$ is
called Lyapunov stable if for every $\varepsilon > 0$ there exists
$\delta > 0$ so that for any divergence-free vector
$u(x,0) \in W^{1+s,p}$, $s > 2/p$, such that
$\|u(x,0)\|_{L^2} < \delta$ the unique
solution $u(x,t)$ to [14]-[16] satisfies

\[ \|u(x,t)\|_{L^2} < \varepsilon \quad \text{for } t \in [0,\infty) \]

We note that we require the initial value $u(x,0)$ to
be in the Sobolev space $W^{1+s,p}$, $s > 2/p$, since it is
known that the 2D Euler equations are globally in
time well posed in this function space.


Definition 5 Any steady flow $U_0(x)$ for which the
conditions of Definition 4 are violated is called
nonlinearly unstable in $L^2$.


Observe that the open issues (in three dimensions) 
of nonuniqueness or nonexistence of solutions to 
[14]-[16] would, under Definition 5, be scenarios 
for instability. 


Theorem 6 (Nonlinear instability for 2D Euler
flows). Let $U_0(x) \in C^\infty(T^2)$ satisfy [12]. Let $\Lambda$
be the maximal Lyapunov exponent to the ODE
$\dot{x} = U_0(x)$. Assume that there exists an eigenvalue $\lambda$
in the $L^2$ spectrum of the linear operator $L_E$ given
by [15] with $\mathrm{Re}\,\lambda > \Lambda$. Then in the sense of
Definition 5, $U_0(x)$ is Lyapunov unstable with
respect to growth in the $L^2$-norm.


The proof of this result is given in Vishik and 
Friedlander (2003) and uses a so-called “bootstrap” 
argument whose origins can be found in references 
in that article. We remark that the above result gives 
nonlinear instability with respect to growth of the 
energy of a perturbation which seems to be a 
physically reasonable measure of instability. 

In order to apply Theorem 6 to a specific 2D flow
it is necessary to know that the linear operator $L_E$
has an eigenvalue with $\mathrm{Re}\,\lambda > \Lambda$. As we have
discussed, such knowledge is lacking for a generic
flow $U_0(x)$. Once again, we turn to shear flows. Since,
as we noted, $\Lambda = 0$ for shear flows, any shear profile for
which unstable eigenvalues have been proved to
exist provides an example of nonlinear instability
with respect to growth in the energy.

We conclude with the observation that it is 
tempting to speculate that, given the complexity 
of flows in three dimensions, most, if not all, such 
inviscid flows are nonlinearly unstable. It is clear 
from the concept of the fluid Lyapunov exponent 
that stretching in a flow is associated with 
instabilities and there are more mechanisms for 
stretching in three, as opposed to two, dimensions. 
However, to date there are virtually no mathema- 
tical results for the nonlinear stability problem for 
fully 3D flows and many challenging issues remain 
entirely open. 
Introduction


The theorem on stability of matter is one of the most 
celebrated results in mathematical physics. It is one 
of the rare cases where a result of such great 
importance to our understanding of the world 
around us appeared first in a completely rigorous 
formulation. 

Issues of stability are, of course, extremely impor- 
tant in physics. One of the major triumphs of the 
theory of quantum mechanics is the explanation it 
gives of the stability of the hydrogen atom (and the 
complete description of its spectrum). Quantum 
mechanics or, more precisely, the uncertainty princi- 
ple explains not only the stability of tiny microscopic 
objects, but also the stability of gigantic stellar 
objects such as white dwarfs. Chandrasekhar's 
famous theory on the stability of white dwarfs 
required, however, not only the usual uncertainty 
principle, but also the Pauli exclusion principle for 
the fermionic electrons. 

Whereas both the stability of atoms and the 
stability of white dwarfs were early triumphs of 
quantum mechanics, it, surprisingly, took nearly 
40 years before the question of stability of everyday 
macroscopic objects was even raised (Fisher and 
Ruelle 1966). The rigorous answer to the question 
came shortly thereafter in what came to be known 
as the “theorem on stability of matter,” proved first
by Dyson and Lenard (1967). 

Both the stability of hydrogen and the stability of 
white dwarfs simply mean that the total energy of 
the system cannot be arbitrarily negative. If there 
were no such lower bound to the energy, one would 
have a system from which it would be possible, in 
principle, to extract an infinite amount of energy. 
One often refers to this kind of stability as stability 
of the first kind. 

Stability of matter is somewhat different. Stability 
of the first kind for atoms generalizes, as noted later, 
to objects of macroscopic size. The question arises 
as to how the lowest possible energy depends on the 
size or, more precisely, on the (macroscopic) number 
of particles in the object. Stability of matter in its 
precise mathematical formulation is the requirement 
that the lowest possible energy depends at most 
linearly on the number of particles. Put differently, 
the lowest possible energy calculated per particle 


cannot be arbitrarily negative as the number of 
particles increases. This is often referred to as 
"stability of the second kind." If stability of the 
second kind does not hold, one would be able to 
extract an arbitrarily large amount of energy by 
adding a single atomic particle to a sufficiently large 
macroscopic object. 

A perhaps more intuitive notion of stability is 
related to the volume occupied by a macroscopic 
object. More precisely, the volume of the object, 
when its total energy is close to the lowest possible 
energy, grows at least linearly in the number of 
particles. This volume dependence is a fairly simple 
consequence of stability of matter as formulated 
above. 

The first mention of stability of the second kind 
for a charged system is perhaps by Onsager (1939), 
who studied a system of charged classical particles 
with a hard core and proved the stability of the 
second kind. The proof of stability of matter by 
Dyson and Lenard, which does not rely on any hard- 
core assumption, but rather on the properties of 
fermionic quantum particles, used results from 
Onsager's paper. 

The real relevance of the notion of stability of the 
second kind was first realized by Fisher and Ruelle 
(1966) in an attempt to understand the thermo- 
dynamic properties of matter and to give meaning 
to thermodynamic quantities such as the energy 
density (energy per volume). Stability of matter is a 
necessary ingredient in explaining the existence of 
thermodynamics, that is, that the energy per 
volume has a well-defined limit as the volume and 
number of particles tend to infinity, with the ratio 
(i.e., the density of particles) kept fixed. The 
existence of this limit is, however, not just a simple 
consequence of stability of matter. The existence of 
the thermodynamic limit for ordinary charged 
matter was proved rigorously by Lieb and Lebowitz 
(1972) using the result on stability of matter as an 
input. 

After the original proof of stability of matter by 
Dyson and Lenard, several other proofs were given 
(see, e.g., reviews by Lieb (1976, 1990, 2004) for 
detailed references). Lieb and Thirring (1975) in 
particular presented an elegant and simple proof 
relying on an uncertainty principle for fermions. As 
explained in a later section, the best mathematical 
formulation of the usual uncertainty principle is in 
terms of a Sobolev inequality. The method of Lieb 
and Thirring is related to a Sobolev type inequality 
for antisymmetric functions. The Lieb-Thirring
inequality is discussed later. The proof by Dyson
and Lenard gave a very poor bound on the lowest
possible energy per particle. The proof by Lieb and 
Thirring gave a much more realistic bound on this 
quantity (see below). Two proofs of stability of 
matter will be sketched here. Both proofs rely on the 
Lieb-Thirring inequality. The first proof described is 
mathematically simple to explain, whereas the 
second proof (Lieb-Thirring) is based on the 
Thomas-Fermi theory. It is mathematically some- 
what more involved but, from a physical point of 
view, more intuitive. 

As in the case of white dwarfs, stability of matter 
relies on the fermionic property of electrons. Dyson 
(1967) proved that the stability of the second kind 
fails if we ignore the Pauli exclusion principle. In 
physics textbooks, the importance of the Pauli 
exclusion principle for the stability of white dwarfs 
is often emphasized. Its importance for the stability 
of everything around us is usually ignored. 

As mentioned above, the result on stability of
matter appeared from the beginning as a completely 
rigorously proved theorem. In contrast, the stability 
of white dwarfs was only derived rigorously by Lieb 
and Thirring (1984) and Lieb and Yau (1987) over 
50 years after the original work of Chandrasekhar. 

The original formulation of stability of matter, 
which is given in the next section, dealt with 
charged matter consisting of electrons and nuclei 
interacting only through electrostatic interactions 
and being described by nonrelativistic quantum 
mechanics. Over the years, many generalizations of 
stability of matter have been derived in order to 
include relativistic effects and electromagnetic inter- 
actions. Some of these generalizations will be 
discussed in this article. A complete understanding 
of stability of matter in quantum electrodynamics 
(QED) does not exist as yet, which is intimately 
related to the fact that this theory still awaits a 
mathematically satisfactory formulation. 


The Formulation of Stability of Matter 


Consider $K$ nuclei with nuclear charges $z_1, \ldots, z_K > 0$
at positions $r_1, \ldots, r_K \in \mathbb{R}^3$, and $N$ electrons with
charges $-1$ (this amounts to a choice of units) at
positions $x_1, \ldots, x_N \in \mathbb{R}^3$. In order to discuss
stability, it turns out that one can consider the 
nuclei as fixed in space, whereas the electrons are 
dynamic. More precisely, this means that the 
kinetic energy of the nuclei is ignored. It is 
important to realize that if stability holds for static 
nuclei, it also holds for dynamic nuclei. This is 
simply because the kinetic energy is positive, so that 
the effect of ignoring it is to lower the total energy. 


Since we consider only electrostatic interactions, 
the quantum Hamiltonian describing this system is 


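\[ H_N = \sum_{i=1}^{N} T_i - \sum_{i=1}^{N} \sum_{k=1}^{K} \frac{z_k}{|x_i - r_k|} + \sum_{1 \le i < j \le N} \frac{1}{|x_i - x_j|} + \sum_{1 \le k < l \le K} \frac{z_k z_l}{|r_k - r_l|} \tag*{[1]} \]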


The kinetic energy operator $T_i$ is (half) the Laplacian in
the variable $x_i$, i.e., $T_i = -(1/2)\Delta_i$. Atomic units are
used, where not only the electron charge is $-1$, but the
mass of the electron is also 1 and $\hbar = 1$. The unit of
energy is then 2 Ry.
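
The electrostatic part of the Hamiltonian (electron-nucleus attraction, electron-electron repulsion and nucleus-nucleus repulsion) is just a function of the particle positions, and it can be evaluated directly for a given configuration. The sketch below does this in the atomic units described above; the function name and the sample configurations are hypothetical and serve only as an illustration. The kinetic energy is not included.

```python
import math
from itertools import combinations

def coulomb_energy(electrons, nuclei, charges):
    """Electrostatic energy in atomic units for point electrons (charge -1)
    and fixed nuclei with positive charges; positions are 3-tuples."""
    attraction = -sum(
        z / math.dist(x, r)
        for x in electrons
        for r, z in zip(nuclei, charges)
    )
    electron_repulsion = sum(
        1.0 / math.dist(x, y) for x, y in combinations(electrons, 2)
    )
    nuclear_repulsion = sum(
        zk * zl / math.dist(rk, rl)
        for (rk, zk), (rl, zl) in combinations(list(zip(nuclei, charges)), 2)
    )
    return attraction + electron_repulsion + nuclear_repulsion

# One electron at unit distance from a single nucleus of charge 1.
hydrogen_like = coulomb_energy([(1.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)], [1.0])

# Two electrons and two unit-charge nuclei in a symmetric planar arrangement.
molecule_like = coulomb_energy(
    [(1.0, 1.0, 0.0), (1.0, -1.0, 0.0)],
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [1.0, 1.0],
)
print(hydrogen_like, molecule_like)
```

The attraction term is unbounded below as an electron approaches a nucleus, which is why the kinetic energy (the uncertainty principle) is needed for any lower bound on the total energy.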

The Hamiltonian $H_N$ depends on the parameters
$z = (z_1, \ldots, z_K)$ and $r = (r_1, \ldots, r_K)$. It acts on the
Hilbert space of fermionic, that is, antisymmetric
wave functions. More precisely, the fermionic
Hilbert space is

\[ \mathcal{H}_N^F = \bigwedge^N L^2(\mathbb{R}^3; \mathbb{C}^2) \]

Here the target space is $\mathbb{C}^2$, in order to describe
spin-1/2 particles. One can, of course, also consider
the Hamiltonian $H_N$ on the full Hilbert space,

\[ \mathcal{H}_N = L^2(\mathbb{R}^{3N}; \mathbb{C}^{2^N}) = \bigotimes^N L^2(\mathbb{R}^3; \mathbb{C}^2) \]

of which $\mathcal{H}_N^F$ is a subspace.
The quantity of interest is the ground-state energy


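\[ E^F(z, N, K) = \inf_{r} \inf \operatorname{spec}_{\mathcal{H}_N^F} H_N
= \inf_{r} \inf \left\{ (\psi, H_N \psi) \;\middle|\; \psi \in \mathcal{H}_N^F \cap C_0^\infty(\mathbb{R}^{3N}; \mathbb{C}^{2^N}),\ (\psi, \psi) = 1 \right\} \tag*{[2]} \]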


and likewise for the ground-state energy $E(z, N, K)$
on the full space $\mathcal{H}_N$. Clearly,
$E^F(z, N, K) \ge E(z, N, K)$. It turns out that the energy
$E(z, N, K)$ is the same as one would get by restricting to
symmetric functions instead of antisymmetric
ones. Therefore, the energy $E(z, N, K)$ is often
referred to as the lowest possible energy for bosonic
particles.

The Hamiltonian $H_N$ is an unbounded operator
and we must discuss its domain to be able to talk
about its spectrum. Also, it should be self-adjoint. It
turns out that these questions are intimately related
to stability. The operator $H_N$ is well defined on
smooth (i.e., $C_0^\infty$) functions. Thus, the last definition
of $E^F(z, N, K)$ in [2] is meaningful. If this ground-state
energy is finite (i.e., not $-\infty$), then the Hamiltonian
has an extension, the Friedrichs extension, to a
self-adjoint operator with the property that the
second equality in [2] holds.

In the definition of $E^F$, we have minimized over
all the positions $r$ of the nuclei. Even though the
nuclear dynamics is not considered, one is still
interested in finding the lowest possible energy
independent of where they are located.


Theorem 1 (Stability of the first kind). For all $N$,
$K$, and $z$, we have

\[ E(z, N, K) > -\infty \]


Theorem 2 (Stability of matter). There exists a
constant $C_{|z|} > 0$ depending only on
$|z| = \max\{z_1, \ldots, z_K\}$ such that

\[ E^F(z, N, K) \ge -C_{|z|} (N + K) \]


The constant $C_{|z|}$ bounds the binding energy per
particle. In the case of hydrogen atoms, when
$|z| = 1$, Dyson and Lenard arrived at a bound with
$C_1 \sim 10^{14}$ Ry. Lieb and Thirring arrive at $C_1 =$
5-10 Ry. Since the binding energy of a single
hydrogen atom is 1 Ry, it is easy to see that one
must have $C_1 \ge 1/4$. Over the years, there have
been some improvements on the estimated value of
this constant in the theory of stability of matter.
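
The lower limit on the constant follows from a one-line check. As a sketch, take a single hydrogen atom ($N = K = 1$, nuclear charge 1), whose ground-state energy is $-1$ Ry, that is, $-1/2$ in the atomic units used here (unit of energy 2 Ry), and write $C_1$ for the constant of Theorem 2 when the largest nuclear charge equals 1:

```latex
% Hydrogen atom inserted into the stability-of-matter bound with N = K = 1.
\[
  -\tfrac{1}{2} \;=\; E^F(z, 1, 1) \;\ge\; -C_1 \,(N + K) \;=\; -2\,C_1
  \quad\Longrightarrow\quad
  C_1 \;\ge\; \tfrac{1}{4} \ \text{(atomic units)} \;=\; \tfrac{1}{2}\ \mathrm{Ry}.
\]
```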

That the Pauli exclusion principle, that is, the 
fermionic character of the electrons, is necessary for 
stability of matter is a consequence of the next 
theorem. 


Theorem 3 (the $N^{5/3}$ law for bosons). If $N = K$
and $z_1 = \cdots = z_K = z > 0$, then there exist constants
$C_\pm > 0$ depending on $z$ such that

\[ -C_- N^{5/3} \le E(z, N, N) \le -C_+ N^{5/3} \]


It is the superlinear (exponent 5/3) behavior in N 
of the upper bound that violates stability of matter. 
This upper bound was proved by Lieb (1979) by a 
fairly simple variational argument. The lower bound 
above, which shows that the exponent 5/3 is 
optimal, was proved by Dyson and Lenard (1968) 
in their original paper on stability of matter. 
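
The contrast between the two scaling laws is easiest to see per particle. The sketch below compares a linear bound of the form $-C(N+K)$ with an upper bound of the form $-C_+ N^{5/3}$ for $N = K$; the constants are set to 1 purely as placeholders (they are not the values from the theorems), and the function names are hypothetical.

```python
def per_particle_fermionic_bound(n, c=1.0):
    """Lower bound per particle from a linear law -c*(N + K), with N = K = n."""
    total_particles = 2 * n
    return -c * total_particles / total_particles

def per_particle_bosonic_bound(n, c_plus=1.0):
    """Upper bound per particle from the law -c_plus * N**(5/3), with N = K = n."""
    total_particles = 2 * n
    return -c_plus * n ** (5.0 / 3.0) / total_particles

for n in (10, 1000, 100000):
    # The first column stays fixed; the second decreases like -n**(2/3) / 2.
    print(n, per_particle_fermionic_bound(n), per_particle_bosonic_bound(n))
```

With a linear law the energy per particle stays bounded as the system grows, whereas under the $N^{5/3}$ law it becomes arbitrarily negative, which is the failure of stability of the second kind for bosons.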

This theorem leaves open the possibility that the
stability of matter could be recovered by introducing
finite nuclear masses. That this, indeed, is not the case
was proved by Dyson (1967) by a complicated
variational argument based on the Bogolubov pair
theory for superfluid helium. We now add the kinetic
energy $\sum_{k=1}^{K} -(1/2)\Delta_{r_k}$ of the nuclei (assuming, for
simplicity, that they have the same mass as the
electrons) to the Hamiltonian $H_N$ and consider
the case where $z_1 = z_2 = \cdots = z_K = 1$. We denote the
ground-state energy over the space $L^2(\mathbb{R}^{3(N+K)})$
(ignoring spin) by $E(N, K)$. Then, Dyson proved that

\[ \min_{N+K=M} E(N, K) \le -C M^{7/5} \]

for some constant $C > 0$. It was later shown by
Conlon et al. (1988) that the exponent 7/5 is indeed
optimal. Dyson (1967) made a conjecture for the
precise asymptotic behavior of this energy. This
conjecture, which was proved by Lieb and Solovej
(2005) and Solovej (2004), is given in the next
theorem.


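Theorem 4 (Dyson's 7/5-law for the charged
Bose gas).

\[ \lim_{M \to \infty} \min_{N+K=M} \frac{E(N, K)}{M^{7/5}}
= \inf \left\{ \frac{1}{2} \int |\nabla \phi|^2 - J \int \phi^{5/2} \;\middle|\; 0 \le \phi,\ \int \phi^2 = 1 \right\} \tag*{[3]} \]

where

\[ J = \left( \frac{4}{\pi} \right)^{3/4} \frac{\Gamma(1/2)\, \Gamma(3/4)}{5\, \Gamma(5/4)} \]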


Generalizations of Stability of Matter 


Over the years, generalizations of stability of matter 
including relativistic effects and interactions with the 
electromagnetic field have been attempted. Since the 
relativistic Dirac operator is not bounded below, we 
cannot simply replace the standard nonrelativistic 
kinetic energy operator $T_i = -(1/2)\Delta_i$ by the free
Dirac operator. 

Relativistic effects have been included by considering
the (pseudo) relativistic kinetic energy

\[ T_i^{\mathrm{Rel}} = \sqrt{-c^2 \Delta_i + c^4} - c^2 \]

In the units used in this article, the physical value
of the speed of light $c$ is approximately 137 or,
more precisely, the reciprocal of the fine-structure
constant $\alpha$.
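
In momentum space the (pseudo) relativistic kinetic energy acts as multiplication by $\sqrt{c^2 p^2 + c^4} - c^2$, where $p$ is the magnitude of the momentum. The sketch below, with $c = 137$ as quoted above and hypothetical function names, compares this symbol with the nonrelativistic value $p^2/2$. It is written in an algebraically equivalent form that avoids subtracting two nearly equal numbers at small $p$.

```python
import math

C_LIGHT = 137.0  # approximate speed of light in atomic units, as quoted in the text

def t_rel(p, c=C_LIGHT):
    """sqrt(c^2 p^2 + c^4) - c^2, rewritten as c^2 p^2 / (sqrt(c^2 p^2 + c^4) + c^2)."""
    cp2 = (c * p) ** 2
    return cp2 / (math.sqrt(cp2 + c ** 4) + c ** 2)

def t_nonrel(p):
    """Nonrelativistic kinetic energy p^2 / 2 (electron mass 1)."""
    return 0.5 * p * p

for p in (0.1, 1.0, 10.0, 100.0, 1000.0, 1.0e6):
    # Small p: the two agree closely. Large p: t_rel grows only like c * p.
    print(p, t_rel(p), t_nonrel(p), t_rel(p) / (C_LIGHT * p))
```

The linear growth at large momentum is what makes the relativistic problem critical: both the kinetic energy and the Coulomb attraction then scale like an inverse length, so stability can hold only for sufficiently small coupling.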

For this relativistic kinetic energy, Lieb and Yau
(1988) proved that stability of matter holds in the
sense formulated in Theorem 2 if $\alpha$ ($= c^{-1}$) is small
enough and $\max_i \{z_i\} \alpha \le 2/\pi$. It is known here that
the value $2/\pi$ is the best possible, since it is so
for the one-atom case. The one-atom case had
been studied by Herbst. The corresponding case of
a one-electron molecule was studied by Lieb and
Daubechies. Less optimal results on the stability of
matter with relativistic kinetic energy had been
obtained prior to the work of Lieb and Yau by
Conlon and later by Fefferman and de la Llave.
References to these works can be found in the work
of Lieb and Yau (1988).

The relativistic kinetic energy $T_i^{\mathrm{Rel}}$ agrees with the
free Dirac operator on the positive spectral subspace
of the free Dirac operator (i.e., a subspace of
$L^2(\mathbb{R}^3; \mathbb{C}^4)$). Therefore, the stability of matter
follows if $T_i$ is replaced by the free Dirac operator
and if one restricts to the Hilbert space obtained as
in [2] but with $L^2(\mathbb{R}^3; \mathbb{C}^2)$ replaced by the positive
spectral subspace of the free Dirac operator. This
formulation is often referred to as the “no-pair” 
model. In the usual Dirac picture, the negative 
spectral subspace, the Dirac sea, is occupied. As long 
as one ignores pair creation, only the positive 
spectral subspace is available. 

Magnetic fields may be included by considering 
the “magnetic kinetic energy” 


\[ T_i^{\mathrm{Mag}} = \frac{1}{2} \left( -i\nabla_i - c^{-1} A(x_i) \right)^2 \]


It turns out that the stability of matter theorem
(Theorem 2) holds for all magnetic vector potentials
$A: \mathbb{R}^3 \to \mathbb{R}^3$ with a constant $C_{|z|}$ independent of $A$.
This is, therefore, also the case if we consider the
magnetic field (or rather the vector potential) as a
dynamic variable and add the (positive) field energy

\[ U = \frac{1}{8\pi} \int_{\mathbb{R}^3} |\nabla \times A(x)|^2 \, dx \tag*{[4]} \]

to the Hamiltonian. The resulting Hamiltonian
describes a charged spinless particle interacting
with a classical electromagnetic field.

A more complicated situation is described by the 
“magnetic Pauli kinetic energy” 


\[ T_i^{\mathrm{Pauli}} = \frac{1}{2} \left( \left( -i\nabla_i - c^{-1} A(x_i) \right) \cdot \sigma_i \right)^2 \]


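where the coupling of the spin to the magnetic field
is included through the vector of $2 \times 2$ Pauli
matrices acting on the spin components of particle $i$,
that is, $\sigma_i = (\sigma_1, \sigma_2, \sigma_3)$, with

\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]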
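
The algebraic relations among the Pauli matrices can be verified directly. The sketch below, using plain nested lists and hypothetical helper names, checks that each matrix squares to the identity, that distinct matrices anticommute, and that the product of the first two is $i$ times the third. These relations are the reason the square of the spin-coupled momentum differs from the magnetic kinetic energy only by a term coupling the spin to the magnetic field $\nabla \times A$.

```python
IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
SIGMA_1 = [[0, 1], [1, 0]]
SIGMA_2 = [[0, -1j], [1j, 0]]
SIGMA_3 = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    """Scalar multiple of a 2x2 matrix."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def anticommutator(a, b):
    """ab + ba."""
    return add(matmul(a, b), matmul(b, a))

squares_ok = all(matmul(s, s) == IDENTITY for s in (SIGMA_1, SIGMA_2, SIGMA_3))
anticommute_ok = (
    anticommutator(SIGMA_1, SIGMA_2) == ZERO
    and anticommutator(SIGMA_2, SIGMA_3) == ZERO
    and anticommutator(SIGMA_3, SIGMA_1) == ZERO
)
product_ok = matmul(SIGMA_1, SIGMA_2) == scale(1j, SIGMA_3)
print(squares_ok, anticommute_ok, product_ok)
```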


For the Pauli kinetic energy, stability of matter will 
not hold independently of the magnetic field (or even 
for a fixed unbounded magnetic field) unless the field 
energy U in [4] is included in the Hamiltonian. If the 
field energy is included, stability of matter holds 
independently of the magnetic field, that is, even if 
one minimizes over the dynamic variable A, if 
$\alpha$ ($= c^{-1}$) and $\max_k\{z_k\}\alpha^2$ are small enough. This was 
proved by Fefferman (1997) and by Lieb et al. 
(1995). The latter result includes the physical value of 
$\alpha$. The fact that a bound on $\alpha$ is needed had been 
proved by Loss and Yau. Stability for a one-electron 
atom had been proved in this model by Fröhlich, 
Lieb, and Loss. The many-electron atom and the one- 
electron molecule had been studied by Lieb and Loss. 
Most relevant references may be found in the work of 
Lieb et al. (1995). 

The possibility of quantizing the magnetic field has 
also been studied. In this case, one must introduce an 
ultraviolet cutoff in the momentum modes of the 
vector potential. Stability of matter in the resulting 
model of (ultraviolet cutoff) QED coupled to non- 
relativistic matter was proved by Fefferman et al. 
(1997), improving results of Bugliaro, Fröhlich, and Graf. 

Finally, one may include both relativistic effects 
and electromagnetic interactions. Let us first discuss 
the case of classical electromagnetic fields. If instead 
of the Pauli kinetic energy one uses the Dirac 
operator with a magnetic vector potential then 
there would be no lower bound on the energy. But, 
as previously described, one can study a no-pair 
formulation of relativistic particles coupled to 
electromagnetic fields. The question arises which 
subspace of $L^2(\mathbb{R}^3;\mathbb{C}^4)$ one should restrict to (i.e., 
which subspace is filled and which one is available). 
There are two obvious choices. Either one should, as 
before, restrict to the positive spectral subspace of 
the free Dirac operator or one should restrict to the 
positive spectral subspace of the magnetic Dirac 
operator. It is proved by Lieb et al. (1997) that the 
former choice leads to instability, whereas stability 
of matter holds for the latter choice under some 
conditions on $\alpha$ and $\max_k\{z_k\}$. Stability requires that 
the field energy U is included in the Hamiltonian. It 
then holds independently of the magnetic field. 

This final stability result also holds if the magnetic 
field is quantized with an ultraviolet cutoff as 
proved by Lieb and Loss (2002). 

The no-pair model even with the ultraviolet cutoff 
quantized field is not fully relativistically invariant. 
As mentioned above, there is still no mathematical 
formulation of QED, a fully relativistically invariant 
model for quantum particles interacting with elec- 
tromagnetic fields. 


The Proof of Stability of the First Kind 


The proof of stability of the first kind will now be 
sketched for charged quantum systems. 

As mentioned in the introduction, stability of the 
first kind is a consequence of the uncertainty 
principle. Contrary to what is often stated in physics 
textbooks, stability does not follow from the 
Heisenberg formulation of the uncertainty principle. 
A mathematically more flexible formulation 
is provided by the classical Sobolev inequality, 
which states that for all square-integrable functions 
$\psi\in L^2(\mathbb{R}^3)$, one has 

$$\int_{\mathbb{R}^3}|\nabla\psi|^2 \ge C_S\left(\int_{\mathbb{R}^3}|\psi|^6\right)^{1/3} \qquad [5]$$

for some constant $C_S>0$. It follows from this inequality that for 
any attractive potential $V$, there is a lower bound on 
the energy expectation 

$$\begin{aligned}
\left\langle\psi,\left(-\tfrac12\Delta - V\right)\psi\right\rangle
&= \tfrac12\int|\nabla\psi|^2 - \int V|\psi|^2 \\
&\ge \tfrac12 C_S\left(\int|\psi|^6\right)^{1/3} - \left(\int V^{5/2}\right)^{2/5}\left(\int|\psi|^6\right)^{1/5}\left(\int|\psi|^2\right)^{2/5} \\
&\ge -C\int V^{5/2}\int|\psi|^2
\end{aligned} \qquad [6]$$

for some $C>0$. Thus, the lowest possible energy of 
one particle moving in the potential $V$ is bounded 
below by $-C\int V^{5/2}$. For $N$ (noninteracting) particles, 
the lower bound is $-CN\int V^{5/2}$. This holds whether 
or not the particles have spin. If, more generally, the 
potential can be written as $V = U + W$, $U, W \ge 0$, 
where $\int U^{5/2} < \infty$ and $W$ is bounded, $W \le \|W\|_\infty$, 
then the energy of $N$ noninteracting particles moving 
in the potential $V$ is bounded below by 

$$-NC\int U^{5/2} - N\|W\|_\infty.$$
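
The last step in [6] is a one-variable minimization. Writing $X = (\int|\psi|^6)^{1/3}$ and $d = (\int V^{5/2})^{2/5}(\int|\psi|^2)^{2/5}$, the middle expression is $(C_S/2)X - dX^{3/5}$, whose minimum over $X \ge 0$ is $-(2/5)(6/(5C_S))^{3/2}d^{5/2}$, and $d^{5/2} = \int V^{5/2}\int|\psi|^2$. The following Python sketch compares this closed form with a brute-force grid search. The function names and the numerical values chosen for $C_S$ and $d$ are illustrative assumptions, not taken from the article.

```python
def sobolev_energy_constant(c_s: float) -> float:
    """Constant C in  E >= -C * int(V^(5/2)) * int(|psi|^2).

    It comes from minimizing (c_s/2) * X - d * X**(3/5) over X >= 0,
    which gives the minimum -(2/5) * (6 / (5 c_s))**(3/2) * d**(5/2).
    """
    return 0.4 * (6.0 / (5.0 * c_s)) ** 1.5


def grid_minimum(c_s: float, d: float, points: int = 20001) -> float:
    """Brute-force minimum of (c_s/2) * X - d * X**(3/5) on a log-spaced grid."""
    x_star = (6.0 * d / (5.0 * c_s)) ** 2.5  # stationary point, used to center the grid
    best = 0.0                               # value of the expression at X = 0
    for i in range(points):
        x = x_star * 10.0 ** (-2.0 + 4.0 * i / (points - 1))
        best = min(best, 0.5 * c_s * x - d * x ** 0.6)
    return best


# Illustrative inputs only (hypothetical values, not from the article).
c_s, d = 1.0, 1.7
closed_form = -sobolev_energy_constant(c_s) * d ** 2.5
numeric = grid_minimum(c_s, d)
print(closed_form, numeric)
```

The two printed numbers should agree to many digits, which shows that the constant in the final line of [6] depends only on the Sobolev constant.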


For the Hamiltonian $H_N$ from [1], one can get a 
lower bound on the energy $E(z, N, K)$ by ignoring all 
the positive potential terms, that is, the last two 
sums in [1]. The remaining Hamiltonian describes $N$ 
independent particles moving in the potential 

$$V(x) = \sum_{k=1}^{K}\frac{z_k}{|x-r_k|} = \sum_{k=1}^{K}\bigl(U_k(x)+W_k(x)\bigr),$$

where $U_k$ is the restriction of $z_k/|x-r_k|$ to the set 
$|x-r_k|<R$ for some $R>0$ and $W_k$ is the restriction 
to the complementary set. Using [6], one can easily 
see that the energy expectation is bounded below by 

$$-CNK^{5/2}\max_k\{z_k\}^{5/2}R^{1/2} - NK\max_k\{z_k\}R^{-1} = -C'NK^{2}\max_k\{z_k\}^{2},$$

where we have made the optimal choice for 
$R\sim(K\max_k\{z_k\})^{-1}$. 


This finite lower bound on the energy proves 
the stability of the first kind, but it clearly does 
not have the form required for the stability of the 
second kind. 
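
The choice of $R$ can be made explicit. The magnitude of the lower bound has the form $aR^{1/2} + bR^{-1}$ with $a = CNK^{5/2}\max\{z_k\}^{5/2}$ and $b = NK\max\{z_k\}$, which is smallest at $R = (2b/a)^{2/3}$, a fixed multiple of $(K\max\{z_k\})^{-1}$. The sketch below is a minimal illustration of this scaling. The function name and the default value of the constant `c` are assumptions made for the example.

```python
def first_kind_bound(n: int, k: int, z_max: float, c: float = 1.0):
    """Optimal cutoff radius and the resulting lower bound on the energy.

    The bound is -(a * R**0.5 + b / R) with
      a = c * n * k**2.5 * z_max**2.5   (singular part of the potential, near the nuclei)
      b = n * k * z_max                 (bounded part, away from the nuclei)
    and c a placeholder for the unspecified constant C in the text.
    """
    a = c * n * k ** 2.5 * z_max ** 2.5
    b = n * k * z_max
    r_opt = (2.0 * b / a) ** (2.0 / 3.0)      # solves a / (2 sqrt(R)) = b / R**2
    bound = -(a * r_opt ** 0.5 + b / r_opt)
    return r_opt, bound


# Two hypothetical systems: the ratio bound / (N K^2 z^2) is the same for both.
for n, k, z in [(10, 3, 2.0), (40, 7, 5.0)]:
    r_opt, bound = first_kind_bound(n, k, z)
    print(n, k, z, r_opt * k * z, bound / (n * k ** 2 * z ** 2))
```

Both printed ratios are independent of $N$, $K$ and the charges, which is the statement that the bound behaves like $NK^2\max\{z_k\}^2$ rather than like $N+K$.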


The Proof of Stability of Matter 


The proof of stability of the first kind presented in 
the previous section must be improved in two ways 
in order to conclude the stability of matter. 

For fermions, it turns out that the lower bound in 
[6] can be improved in such a way that there is no 
factor N in the first term. This is the content of the 
bound of Lieb and Thirring discussed in the 
introduction. 


Theorem 5 (Lieb-Thirring inequality 1975). The 
sum of all the negative eigenvalues of the operator 
$-\tfrac12\Delta - V(x)$ is bounded below by 

$$-L_{\mathrm{LT}}\int_{\mathbb{R}^3}V(x)^{5/2}\,dx$$

for some constant $L_{\mathrm{LT}}>0$. 


For N noninteracting fermions moving in the 
potential V, the lowest possible energy is given by 
the sum of the N lowest eigenvalues of the operator 
in the above theorem. Thus, the theorem gives a 
lower bound on this energy independently of N. 

The second point where the argument from the 
previous section has to be improved is the control of 
the electrostatic energy. In the above discussion, all 
repulsive terms have simply been ignored. For 
stability of matter, a much more delicate bound is 
needed. Many versions of such bounds have been 
given going back to the work of Onsager (1939). 
Here, a result of Baxter (1980) will be used. 


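Theorem 6 (Baxter's correlation estimate). For all 
positions $x_1,\ldots,x_N, r_1,\ldots,r_K\in\mathbb{R}^3$ and all charges 
$z_1,\ldots,z_K>0$, we have the pointwise inequality 

$$-\sum_{i=1}^{N}\sum_{k=1}^{K}\frac{z_k}{|x_i-r_k|} + \sum_{1\le i<j\le N}\frac{1}{|x_i-x_j|} + \sum_{1\le k<l\le K}\frac{z_kz_l}{|r_k-r_l|} \ge -\sum_{i=1}^{N}V(x_i),$$

where $V(x) = (1+2\max_k\{z_k\})\max_k\{|x-r_k|^{-1}\}$. 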

This theorem simply states that, for a lower 
bound, one can replace the full electrostatic Cou- 
lomb energy by the energy of independent electrons 
moving in the potential where they always see only 
the closest nuclei (with a modified charge). Baxter 
(1980) used probabilistic techniques to prove the 
inequality. An improved version of the inequality 
was given by Lieb and Yau (1988), with an analytic 
proof. 
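
Since Theorem 6 is a pointwise statement about configurations, it can be spot-checked numerically. The following sketch draws random electron and nuclear positions in a cube and compares the full Coulomb energy with the nearest-nucleus bound. It is only a sanity check under arbitrary sampling assumptions (box size, charge range, number of trials), not a proof, and all names in it are illustrative.

```python
import math
import random


def coulomb_energy(xs, rs, zs):
    """Electron-nucleus attraction + electron-electron and nucleus-nucleus repulsion."""
    energy = 0.0
    for x in xs:
        for r, z in zip(rs, zs):
            energy -= z / math.dist(x, r)
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            energy += 1.0 / math.dist(xs[i], xs[j])
    for k in range(len(rs)):
        for l in range(k + 1, len(rs)):
            energy += zs[k] * zs[l] / math.dist(rs[k], rs[l])
    return energy


def baxter_lower_bound(xs, rs, zs):
    """-sum_i V(x_i) with V(x) = (1 + 2 max z) / (distance from x to the nearest nucleus)."""
    factor = 1.0 + 2.0 * max(zs)
    return -sum(factor / min(math.dist(x, r) for r in rs) for x in xs)


def random_point(rng):
    return tuple(rng.uniform(-1.0, 1.0) for _ in range(3))


rng = random.Random(0)  # fixed seed so the check is reproducible
violations = 0
for _ in range(200):
    n, k = rng.randint(1, 8), rng.randint(1, 5)
    xs = [random_point(rng) for _ in range(n)]
    rs = [random_point(rng) for _ in range(k)]
    zs = [rng.uniform(0.5, 5.0) for _ in range(k)]
    if coulomb_energy(xs, rs, zs) < baxter_lower_bound(xs, rs, zs):
        violations += 1
print("violations:", violations)
```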


Similarly to the argument in the previous section, 
one can write $V(x) = U(x) + W(x)$, where $U$ is the 
restriction of $V$ to the set where $\min_k\{|x-r_k|\}<R$ 
for some $R>0$ and $W$ is the restriction to the 
complementary set. It then follows from Baxter’s 
correlation estimate and the Lieb-Thirring inequality 
that the lowest eigenvalue of the Hamiltonian $H_N$ on 
the fermionic Hilbert space $\mathcal{H}_{\mathrm{F}}$ is bounded below by 

$$\begin{aligned}
&-L_{\mathrm{LT}}\int U^{5/2} - N\bigl(1+2\max_k\{z_k\}\bigr)R^{-1} \\
&\qquad\ge -C\bigl(1+2\max_k\{z_k\}\bigr)^{5/2}KR^{1/2} - N\bigl(1+2\max_k\{z_k\}\bigr)R^{-1} \\
&\qquad= -C'\bigl(1+2\max_k\{z_k\}\bigr)^{2}(N+K),
\end{aligned}$$

where $R\sim(1+2\max_k\{z_k\})^{-1}$. This lower bound is 
linear in the total particle number $N+K$, as 
required by stability of matter. 


From Thomas-Fermi Theory to Stability 
of Matter 


In this final section, the proof of stability of matter 
by Lieb and Thirring (1975), where they use the 
Thomas-Fermi theory, is discussed briefly. First note 
that there is a dual formulation of the Lieb-Thirring 
inequality theorem (Theorem 5), which makes the 
connection to the Sobolev inequality [5] much more 
transparent. 


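Theorem 7 (Lieb-Thirring inequality as a kinetic 
energy bound). For any normalized antisymmetric 
(fermionic) wave function $\Psi\in\mathcal{H}_{\mathrm{F}}$, we have with 
$C_{\mathrm{LT}} = \tfrac35\left(\tfrac52 L_{\mathrm{LT}}\right)^{-2/3}$ the following lower bound on the 
kinetic energy: 

$$\sum_{i=1}^{N}\frac12\int_{\mathbb{R}^{3N}}\|\nabla_i\Psi(x_1,\ldots,x_N)\|^2\,dx_1\cdots dx_N \ge C_{\mathrm{LT}}\int_{\mathbb{R}^3}\rho(x)^{5/3}\,dx,$$

where $\|\cdot\|$ is the norm in spin space ($\mathbb{C}^{2^N}$) and the 
one-electron density is given by 

$$\rho(x) = N\int_{\mathbb{R}^{3(N-1)}}\|\Psi(x,x_2,\ldots,x_N)\|^2\,dx_2\cdots dx_N.$$

This estimate follows immediately from Theorem 
5, which implies that 

$$\sum_{i=1}^{N}\frac12\int_{\mathbb{R}^{3N}}\|\nabla_i\Psi\|^2 - \int_{\mathbb{R}^3}V\rho \ge -L_{\mathrm{LT}}\int_{\mathbb{R}^3}V^{5/2}.$$

To arrive at Theorem 7, simply choose 

$$V = \left(\frac{2}{5L_{\mathrm{LT}}}\right)^{2/3}\rho^{2/3}.$$
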
One should compare the Lieb-Thirring kinetic 
energy bound with the expression $(3/10)(3\pi^2)^{2/3}\rho^{5/3}$ 
for the (thermodynamic) energy density of a 
free Fermi gas. One of the yet unproven conjectures 
is that the Lieb-Thirring bound holds with $C_{\mathrm{LT}}$ 
replaced by the free Fermi constant $(3/10)(3\pi^2)^{2/3}$. 
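
The passage from Theorem 5 to Theorem 7 is a pointwise optimization: for each value of the density $\rho$, maximizing $v\rho - L_{\mathrm{LT}}v^{5/2}$ over $v\ge 0$ gives $\tfrac35\left(\tfrac52L_{\mathrm{LT}}\right)^{-2/3}\rho^{5/3}$. The sketch below checks this numerically and also computes the value of $L_{\mathrm{LT}}$ that would reproduce the free Fermi constant. The function names and the sample inputs are illustrative assumptions. The closed form $2^{5/2}/(15\pi^2)$ for that value is obtained by inverting the relation between the two constants and does not appear in the article.

```python
import math


def kinetic_constant(l_lt: float) -> float:
    """Dual constant C_LT = (3/5) * ((5/2) * L_LT)**(-2/3)."""
    return 0.6 * (2.5 * l_lt) ** (-2.0 / 3.0)


def pointwise_legendre(rho: float, l_lt: float, points: int = 20001) -> float:
    """Brute-force maximum over v >= 0 of  v * rho - l_lt * v**(5/2)."""
    v_star = (2.0 * rho / (5.0 * l_lt)) ** (2.0 / 3.0)  # optimal potential value
    best = 0.0
    for i in range(points):
        v = v_star * 10.0 ** (-2.0 + 4.0 * i / (points - 1))
        best = max(best, v * rho - l_lt * v ** 2.5)
    return best


# Free Fermi gas constant quoted in the text.
FREE_FERMI = 0.3 * (3.0 * math.pi ** 2) ** (2.0 / 3.0)
# Value of L_LT for which C_LT would equal FREE_FERMI (derived here, not quoted).
l_needed = 2.0 ** 2.5 / (15.0 * math.pi ** 2)

rho, l_lt = 0.8, 0.05  # hypothetical sample values
print(pointwise_legendre(rho, l_lt), kinetic_constant(l_lt) * rho ** (5.0 / 3.0))
print(kinetic_constant(l_needed), FREE_FERMI)
```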

The idea in the Lieb-Thirring proof of stability of 
matter is to bound the energy below by an 
expression depending only on the one-electron 
density. Theorem 7 achieves this for the kinetic 
energy. What is missing is a lower bound on the 
electrostatic Coulomb energy depending only on the 
density. One can show (see Lieb (1976) or Lieb and 
Thirring (1975)) that, except for an error of the 
form “$-\mathrm{const}\times N$,” the total energy expectation 
$\langle\Psi,H_N\Psi\rangle$ may be bounded below by 

$$C\int\rho^{5/3} - \sum_{k=1}^{K}\int\frac{z_k}{|x-r_k|}\rho(x)\,dx + \frac12\iint\frac{\rho(x)\rho(y)}{|x-y|}\,dx\,dy + \sum_{1\le k<l\le K}\frac{z_kz_l}{|r_k-r_l|}. \qquad [7]$$

Here, as before, $\rho$ is the one-electron density of the 
$N$-body wave function $\Psi$. The expression [7] is the 
famous Thomas-Fermi energy functional. It has 
been studied rigorously by Lieb and Simon (1977). 
The Thomas-Fermi energy is the infimum of the 
expression [7] over all $\rho$ with $\int\rho = N$. One of the 
important results about the Thomas-Fermi energy is 
Teller's no-binding theorem (Lieb and Simon 1977). 
It states that in Thomas-Fermi theory atoms do not 
bind to form molecules. This means that the 
Thomas-Fermi energy is greater than the sum of 
the individual atomic energies (these energies in turn 
depend only on the nuclear charges). 

The above Thomas-Fermi lower bound on the 
energy expectation $\langle\Psi,H_N\Psi\rangle$ together with the no- 
binding theorem implies stability of matter. 

The generalizations to stability of matter dis- 
cussed earlier are proved in a way similar to the 
proof presented in the previous section. 


See also: h-Pseudodifferential Operators and 
Applications; Quantum Statistical Mechanics: Overview; 
Schrödinger Operators. 


Further Reading 


Baxter JR (1980) Inequalities for potentials of particle systems. 
Illinois Journal of Mathematics 24: 645-652. 

Conlon JG, Lieb EH, and Yau H-T (1988) The $N^{7/5}$ law for 
charged bosons. Communications in Mathematical Physics 
116: 417-448. 

Dyson FJ (1967) Ground state energy of a finite system of charged 
particles. Journal of Mathematical Physics 8: 1538-1545. 


Dyson FJ and Lenard A (1967) Stability of matter. I. Journal of 
Mathematical Physics 8: 423-434. 

Dyson FJ and Lenard A (1968) Stability of matter. II. Journal of 
Mathematical Physics 9: 698-711. 

Fefferman CL (1997) Stability of matter with magnetic fields. 
CRM Proceedings and Lecture Notes 12: 119-133. 

Fefferman C, Fröhlich J, and Graf GM (1997) Stability of ultraviolet 
cutoff quantum electrodynamics with non-relativistic matter. 
Communications in Mathematical Physics 190: 309-330. 

Fisher M and Ruelle D (1966) The stability of many-particle 
systems. Journal of Mathematical Physics 7: 260-270. 

Lieb EH and Lebowitz JL (1972) The constitution of matter: 
existence of thermodynamics for systems composed of 
electrons and nuclei. Advances in Mathematics 9: 316-398. 

Lieb EH and Thirring WE (1975) Bound for the kinetic energy of 
fermions which proves the stability of matter. Physical Review 
Letters 35: 687-689. 

Lieb EH (1976) The stability of matter. Reviews of Modern 
Physics 48: 553-569. 

Lieb EH and Simon B (1977) Thomas-Fermi theory of atoms, 
molecules and solids. Advances in Mathematics 23: 22-116. 

Lieb EH (1979) The $N^{5/3}$ law for bosons. Physics Letters A 70: 
71-73. 

Lieb EH and Thirring WE (1984) Gravitational collapse in 
quantum mechanics with relativistic kinetic energy. Annals of 
Physics, NY 155: 494-512. 

Lieb EH and Yau H-T (1987) The Chandrasekhar theory of 
stellar collapse as the limit of quantum mechanics. Communications 
in Mathematical Physics 112: 147-174. 


Stability of Minkowski Space 


S Klainerman, Princeton University, Princeton, NJ, USA 
© 2006 Elsevier Ltd. All rights reserved. 


Introduction 


The Minkowski space, which is the simplest solution 
of the Einstein field equations in vacuum, that is, in 
the absence of matter, plays a fundamental role in 
modern physics as it provides the natural mathema- 
tical background of the special theory of relativity. It 
is most reasonable to ask whether it is stable under 
small perturbations. In other words, can arbitrarily 
small perturbations of flat initial conditions lead to 
developments which are radically different, in the 
large, from the flat Minkowski space? It turns out to 
be a highly nontrivial problem as the Einstein 
equations are of a quasilinear hyperbolic character. 
Typical systems of this type, in three space dimen- 
sions, do form singularities in finite time even for 
small disturbances of their trivial initial data. To 
avoid finite-time singularities, we must require that 
sufficiently small perturbations of Minkowski space 
are geodesically complete. This, however, is not 
enough; one should also insist that the corresponding 
spacetimes become flat along all possible directions, 
that is, globally asymptotically flat. This is measured 
by the decay of the curvature tensor to zero. The 
precise rate of decay is also of interest. One expects 
that various null-frame components of the curvature 
tensor decay at different rates along outgoing null 
hypersurfaces; this goes under the name of “peeling 
estimates.” It turns out in fact that we cannot prove 
geodesic completeness without establishing at the 
same time sufficiently fast rates of decay to flatness 
corresponding to at least some peeling. 

The problem of stability of Minkowski space is 
intimately related to that of describing the asympto- 
tic properties of the gravitational field at large 
distances from an isolated, weakly radiating physical 
system. Precise laws of gravitational radiation can 
be deduced from the assumption that the spacetime 
(M,g) under consideration can be conformally 
compactified by adding a boundary $\mathcal{I}$, called scri, 
to M so that an appropriate conformal rescaling of g 
can be extended smoothly to the new manifold 
$(\bar M,\bar g)$ with boundary. In reality, the compactified 
spacetime cannot be smooth at the particular point 
$i^0$ corresponding to spacelike infinity. A spacetime 
(M,g) is called asymptotically simple (AS) if its 
conformal completion is smooth everywhere except at 
$i^0$ and every null geodesic intersects $\mathcal{I}$ at precisely 
two endpoints. The AS assumption allows one to 
derive precise decay asymptotics for various curvature 
components of (M,g) along null geodesics, which 
are referred to as strong peeling. The obvious 
questions raised by this procedure are: do there exist 
nontrivial AS spacetimes and, if so, do they contain 
a sufficiently large class of radiating spacetimes 
including those which appear in all relevant 
applications? 

Clearly, the two problems mentioned above are 
related but not equivalent. Asymptotically simple 
spacetimes verify strong peeling, in particular they 
are globally asymptotically flat, that is, their 
curvature tensor tends to zero along all geodesics. 
Yet, it is perfectly possible that arbitrarily small 
perturbations of the Minkowski space are geodesi- 
cally complete and globally asymptotically flat 
without being asymptotically simple. 

The first global stability result of the Minkowski 
metric was proved by Christodoulou and Klainer- 
man (1993). Their result proves sufficiently strong 
peeling estimates to allow one to derive the most 
important properties of gravitational radiation, such 
as the Bondi mass-law formula, but not as strong as 
those consistent with asymptotic simplicity. A 
companion result was proved by Klainerman and 
Nicolò (2003). Recently, Rodnianski and Lindblad 
(submitted) have obtained a surprising global 
stability of Minkowski result for the Einstein vacuum 
equations in the Lorentz gauge, which provides 
considerably weaker peeling than Christodoulou and 
Klainerman (1993) and Klainerman and Nicolò 
(1999) but is much easier to prove. 

The goal of this article is to describe various results 
obtained since the early 1980s concerning both 
aspects of the problem of stability of Minkowski 
mentioned above. 


Initial Data Formulation 


The proper mathematical context for the stability of 
Minkowski is that provided by the initial-value 
problem for vacuum solutions to the Einstein field 
equations, that is, Ricci flat spacetimes (M, g), 
$R_{\mu\nu} = 0$. We recall the following simple definitions: 


Definition 1 An initial data set is a triplet $(\Sigma, g, k)$ 
consisting of a three-dimensional complete Riemannian 
manifold $(\Sigma,g)$ and a 2-covariant symmetric 
tensor $k$ on $\Sigma$ satisfying the constraint equations: 

$$\nabla^j k_{ij} - \nabla_i\,\mathrm{tr}\,k = 0, \qquad R - |k|^2 + (\mathrm{tr}\,k)^2 = 0,$$

where $\nabla$ is the covariant derivative, $R$ the scalar 
curvature of $(\Sigma, g)$. An initial data set is said to be 
maximal if $\mathrm{tr}_g k = 0$. This is a gauge condition which 
can be imposed without loss of generality. For 
simplicity we shall assume, throughout this article, 
that all initial data sets we consider are maximal. 


Definition 2 An initial data set is said to be flat, or 
trivial, if it corresponds to a complete spacelike 
hypersurface in Minkowski space with its induced 
metric and second fundamental form. An initial data 
set is said to be asymptotically flat if there exists a 
system of coordinates $(x^1,x^2,x^3)$ defined in a 
neighborhood of infinity on $\Sigma$, with 
$r=\sqrt{(x^1)^2+(x^2)^2+(x^3)^2}$, relative to which the 
metric $g$ approaches the Euclidean metric and $k$ 
approaches zero as $r\to\infty$. We assume, for simplicity, 
that $\Sigma$ has only one end. A neighborhood of 
infinity means the complement of a sufficiently large 
compact set on $\Sigma$. 


Remark 1 Because of the constraint equations, the 
asymptotic behavior cannot be arbitrarily pre- 
scribed. A precise definition of asymptotic flatness 
has to involve the ADM mass of $(\Sigma, g)$. Taking the 
mass into account, we write 

$$g_{ij} = \left(1+\frac{2M}{r}\right)\delta_{ij} + o(r^{-1}).$$

According to the positive-mass theorem, $M\ge 0$, and 
$M = 0$ implies that the initial data set is flat. 


Definition 3 We say that an initial data set is 
strongly asymptotically flat if, for some $s\ge 3/2$, 
relative to the coordinate system mentioned above, 

$$g_{ij} = \left(1+\frac{2M}{r}\right)\delta_{ij} + O(r^{-s}), \qquad k_{ij} = O(r^{-s-1}) \qquad \text{as } r\to\infty.$$

Moreover, every derivative of $g-(1+2M/r)\delta$ and $k$ 
improves the asymptotics by one. 
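
A standard example, not discussed in the article and given here only as an illustration, is the time-symmetric ($k=0$) slice of the Schwarzschild solution written in isotropic coordinates, where the metric is $(1+M/(2r))^4\delta_{ij}$. Its leading part is $(1+2M/r)\delta_{ij}$, as in Remark 1, and the remainder is of order $r^{-2}$, so in particular it decays faster than $r^{-3/2}$. The sketch below checks this numerically. The function names and the choice $M=1$ are assumptions made for the example.

```python
def isotropic_conformal_factor(mass: float, r: float) -> float:
    """Factor (1 + M/(2r))**4 multiplying delta_ij on the t = 0 Schwarzschild slice
    in isotropic coordinates (assumed form, not taken from the article)."""
    return (1.0 + mass / (2.0 * r)) ** 4


def remainder(mass: float, r: float) -> float:
    """Metric coefficient minus the leading part (1 + 2M/r)."""
    return isotropic_conformal_factor(mass, r) - (1.0 + 2.0 * mass / r)


mass = 1.0  # illustrative value
for r in (1e1, 1e2, 1e3, 1e4):
    # r**1.5 * remainder tends to 0, while r**2 * remainder tends to 3 M**2 / 2.
    print(r, r ** 1.5 * remainder(mass, r), r ** 2 * remainder(mass, r))
```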


Definition 4 A Cauchy development of an initial 
data set $(\Sigma,g,k)$ is a spacetime manifold (M, g) 
satisfying the Einstein equations together with an 
embedding $i:\Sigma\to M$ such that $i_*(g), i_*(k)$ are the 
first and second fundamental forms of $i(\Sigma)$ in M. 
A development is required to be also globally 
hyperbolic (which means that $i(\Sigma)$ is a Cauchy 
hypersurface, i.e., each causal curve in M intersects 
$i(\Sigma)$ at precisely one point) in order to assure the 
unique dependence of solutions on the data. A 
future development of $(\Sigma, g, k)$ consists of a globally 
hyperbolic manifold (M,g) with boundary, satisfying 
the Einstein equations, and an embedding $i$ as 
before which identifies $\Sigma$ to the boundary of M. 

The most primitive question asked about the 
initial-value problem, solved in a satisfactory way, 
for very large classes of evolution equations, is that of 
local existence and uniqueness of solutions. For the 
Einstein equations, this type of result was first 
established by Bruhat (1952) with the help of wave 
coordinates which allowed her to cast the Einstein 
equations in the form of a system of nonlinear wave 
equations to which one can apply the standard theory 
of symmetric hyperbolic systems. A stronger result, 
due to Hughes et al. (1976), states the following: 


Theorem 1 Let $(\Sigma,g,k)$ be an initial data set for 
the Einstein vacuum equations. Assume that $\Sigma$ can 
be covered by a locally finite system of coordinate 
charts $U_\alpha$ related to each other by $C^1$ diffeomorphisms, 
such that $(g,k)\in H^s_{\mathrm{loc}}(U_\alpha)\times H^{s-1}_{\mathrm{loc}}(U_\alpha)$ with 
$s > 5/2$. Then there exists a unique (up to an 
isometry) globally hyperbolic, Hausdorff development 
(M, g) for which $\Sigma$ is a Cauchy hypersurface. 


In Theorem 1, the uniqueness up to an isometry 
requires additional regularity, s > (5/2) + 1, on the 
data. One has uniqueness, however, without addi- 
tional regularity for the reduced Einstein equations 
system in wave coordinates. 


Remark 2 In the case of nonlinear systems of 
differential equations, the local existence and 
uniqueness result leads, through a straightforward 
extension argument, to a global result. The formula- 
tion of the same type of result for the Einstein 
equations is a little more subtle; it was done by 
Bruhat and Geroch. 


Theorem 2 (Bruhat-Geroch). For each smooth 
initial data set, there exists a unique maximal future 
development. 


Thus, any construction, obtained by an evolution- 
ary approach from a specific initial data set, must be 
necessarily contained in its maximal development. 
This may be said to solve the problem of global 
existence and uniqueness in general relativity. This is, 
of course, misleading: for equations defined on a fixed 
background, a global solution is one which exists for all 
time. In general relativity, however, we have no such 
background as the spacetime itself is the unknown. 
The connection with the classical meaning of a global 
solution requires a special discussion concerning the 
proper time of timelike geodesics; all further ques- 
tions may be said to concern the qualitative properties 
of the maximal development. The central issue is that 
of existence and character of singularities. First, we 
can define a regular maximal development as one 
which is complete in the sense that all future timelike 
and null geodesics can be indefinitely extended 


relative to their proper time (or affine parameter in 
the case of null geodesics). If the initial data set is 
sufficiently far off from the trivial one, the corre- 
sponding future development may not be regular. 
This is the content of the following well-known 
theorem of Penrose (1979). 


Theorem 3 If the manifold support of an initial 
data set is noncompact and contains a closed 
trapped surface, the corresponding maximal devel- 
opment is incomplete. 

At the opposite end of Penrose’s trapped-surface 
condition, the problem of stability of Minkowski 
space concerns the development of asymptotically 
flat initial data sets which are sufficiently close to 
the trivial one. Although it may be reasonable to 
expect the existence of a sufficiently small neighbor- 
hood of the trivial initial data set, in an appropriate 
topology, such that all corresponding developments 
are geodesically complete and globally asymptoti- 
cally flat, such a result was by no means preor- 
dained. First, all known explicit asymptotically 
flat solutions of the Einstein vacuum equations, 
that is, the Kerr family, are singular. The attempts 
to construct nonexplicit, dynamic, solutions based 
on the conformal compactification method, due 
to Penrose (1962), were obstructed by the irregular 
behavior of initial data sets at $i^0$. (The problem is 
that the singularity at $i^0$ could propagate and thus 
destroy the expected smoothness of scri. This 
problem has been recently solved by constructing 
initial data sets which are precisely stationary at 
spacelike infinity.) Finally, the attempts, using 
partial differential equation hyperbolic methods, 
to extend the classical local result of Bruhat 
ran into the usual difficulties of establishing global 
in time existence to solutions of quasilinear hyper- 
bolic systems. Indeed, as mentioned above, the 
wave coordinate gauge allows one to express 
the Einstein vacuum equations in the form of 
a system of nonlinear wave equations which does 
not satisfy Klainerman's null condition (the null 
condition (Klainerman 1983, 1986) identifies an 
important class of quasilinear systems of wave 
equations in four spacetime dimensions for which 
one can prove global in time existence of small 
solutions) and thus was thought to lead to formation 
of singularities. (The conjectured singular behavior of 
wave coordinates was thought, however, to reflect 
only the instability of the specific choice of gauge 
condition and not a true singularity of the equations.) 
According to Bruhat (personal communication), 


Einstein himself had reasons to believe that the 
Minkowski space may not be stable. The problem 
of stability of the Minkowski space was first settled 
by Christodoulou and Klainerman (1990). 


Theorem 4 (Global stability of Minkowski). Any 
asymptotically flat initial data set which is suffi- 
ciently close to the trivial one has a complete 
maximal future development. 


A related result (Theorem 5), proved recently by 
Klainerman and Nicolò (2003a), solves the problem 
of radiation for arbitrary asymptotically flat initial 
data sets. A proof of the result below can also be 
derived, indirectly, from Christodoulou and Klainerman 
(1993). The proof of Klainerman and Nicolò 
(2003a) avoids, however, a great deal of the 
technical complications of that proof. 


Theorem 5 For any suitably defined asymptotically 
flat initial data set $(\Sigma,g,k)$ with maximal 
future development (M,g), one can find a suitable 
domain $\Omega_0\subset\Sigma$ with compact closure in $\Sigma$ such that 
the boundary $D_0^+$ of its domain of influence $C^+(\Omega_0)$, 
or causal future of $\Omega_0$, in M has complete null 
geodesic generators with respect to the corresponding 
affine parameters. 


Both the results of Christodoulou-Klainerman and 
Klainerman-Nicolò prove in fact a lot more than 
stated above. They provide a wealth of information 
concerning the behavior of null hypersurfaces as well 
as the rate at which various components of the 
Riemann curvature tensor approach zero along time- 
like and null geodesics. Here are more precise 
versions for Theorems 4 and 5. 


Theorem 4 (Expanded version). Assume that 
$(\Sigma,g,k)$ is maximal and strongly asymptotically 
flat, $g-(1+2M/r)\delta = O(r^{-3/2})$, $k = O(r^{-5/2})$, plus 
an appropriate global smallness assumption. We can 
construct a complete spacetime (M,g) together with a 
maximal foliation $\Sigma_t$ given by the level hypersurfaces 
of a time function $t$ and a null foliation $C_u$ given by the 
level hypersurfaces of an outgoing optical function $u$ 
such that relative to an adapted null frame $e_4 = L$, 
$e_3 = \underline{L}$, and $(e_a)_{a=1,2}$ we have, along the null 
hypersurfaces $C_u$, the weak peeling decay, 

$$\begin{aligned}
\alpha_{ab} &= R(L,e_a,L,e_b) = O(r^{-7/2}) \\
2\beta_a &= R(L,\underline{L},L,e_a) = O(r^{-7/2}) \\
4\rho &= R(L,\underline{L},L,\underline{L}) = O(r^{-3}) \\
4\sigma &= {}^{*}R(L,\underline{L},L,\underline{L}) = O(r^{-3}) \\
2\underline{\beta}_a &= R(\underline{L},L,\underline{L},e_a) = O(r^{-2}) \\
\underline{\alpha}_{ab} &= R(\underline{L},e_a,\underline{L},e_b) = O(r^{-1})
\end{aligned} \qquad [1]$$

as $r\to\infty$ with $4\pi r^2 = \mathrm{Area}(S_{t,u} = \Sigma_t\cap C_u)$. Also, 
$\rho-\bar\rho,\ \sigma = O(r^{-7/2})$, with $\bar\rho$ the average of $\rho$ over the 
compact 2-surfaces $S_{t,u} = \Sigma_t\cap C_u$. 


Three points are noteworthy. (1) The outgoing 
optical function refers to the solution of the eikonal 
equation $g^{\alpha\beta}\partial_\alpha u\,\partial_\beta u = 0$ whose level hypersurfaces 
$C_u$ intersect $\Sigma_t$ in expanding wave fronts for 
increasing $t$; (2) the generators $L$ and $\underline{L}$ are given 
by: $L = -g^{\alpha\beta}\partial_\beta u\,\partial_\alpha$, the null geodesic generator of 
$C_u$; $\underline{L}$ is then the null conjugate of $L$, perpendicular 
to $S_{t,u} = C_u\cap\Sigma_t$; and (3) $e_a$ is an orthonormal frame 
on $S_{t,u}$. 


Theorem 5 (Expanded version). For any asymptotically 
flat initial data set $(\Sigma, g, k)$ verifying the same 
asymptotic flatness conditions as in Theorem 4, one 
can find a suitable domain $\Omega_0\subset\Sigma$ with compact 
closure in $\Sigma$ such that its future domain of influence 
$C^+(\Omega_0)$ can be foliated by two null foliations: one 
outgoing, $C(u)$, whose leaves are complete towards the 
future, and a second one, $\underline{C}(\underline{u})$, which is incoming. 
Let $S(u,\underline{u}) = C(u)\cap\underline{C}(\underline{u})$ denote the compact 
2-surfaces of intersection between the outgoing and 
incoming null hypersurfaces, whose area is denoted 
by $4\pi r^2$, and consider an adapted null frame 
$L, \underline{L}, (e_a)_{a=1,2}$ (that is, $L$ is the geodesic null 
generator of $C(u)$, $\underline{L}$ its null conjugate perpendicular 
to $S(u,\underline{u})$, and $e_a$ an orthonormal frame on 
$S(u,\underline{u})$) at every point along an outgoing null cone $C(u)$. Then, 
denoting by $\alpha,\beta,\rho,\sigma,\underline{\beta},\underline{\alpha}$ the null components of 
the curvature tensor, as in Theorem 4, we have, along 
$C(u)$ as $r\to\infty$, 

$$\alpha,\ \beta,\ \rho-\bar\rho,\ \sigma = O(r^{-7/2}), \qquad \underline{\beta} = O(r^{-2}), \qquad \underline{\alpha} = O(r^{-1}). \qquad [2]$$

Observe that the rates of decay in [1] and [2] are 
the same. This will be referred to as weak peeling to 
distinguish it from the rates of decay compatible with 
asymptotic simplicity, that is, 

$$\alpha = O(r^{-5}),\quad \beta = O(r^{-4}),\quad \rho,\sigma = O(r^{-3}),\quad \underline{\beta} = O(r^{-2}),\quad \underline{\alpha} = O(r^{-1}), \qquad [3]$$

to which we shall refer as strong peeling. We shall 
discuss these further in the next section, 
following a review of a recent result of Lindblad- 
Rodnianski. 

Even the expanded forms of Theorems 4 and 5 
stated here do not exhaust all the information 
provided by the global stability results in Christodoulou 
and Klainerman (1993) and Klainerman and Nicolò 
(2003a). Of particular interest are the main 
asymptotic conclusions which can be derived 
with the help of this information, the most 
important being the Bondi mass-law formula which 
calculates the gravitational energy radiated at null 
infinity. 

The simplest gauge condition in which the 
hyperbolic character of the Einstein field equations 
is easiest to exhibit is the wave coordinate 
condition; that is, one solves the Einstein vacuum 
equations relative to a special system of coordinates 
$x^\alpha$ which satisfy the equation $\Box_g x^\alpha = 0$. Then, 
denoting by $h_{\alpha\beta} = g_{\alpha\beta} - m_{\alpha\beta}$, with $m$ the standard 
Minkowski metric, we obtain the following system 
of quasilinear wave equations in $h$, 

$$g^{\alpha\beta}\partial_\alpha\partial_\beta h = N(h,\partial h) \qquad [4]$$

with $N(h,\partial h)$ a nonlinear term, quadratic in $\partial h$, 
which can be exhibited explicitly. This form of the 
Einstein field equations, called the wave coordinates 
reduced Einstein equations, is precisely the one 
which allowed Bruhat (1952) to prove the first 
local existence result. Later, she also pointed out 
that the first nontrivial iterate of [4] behaves like 
$t^{-1}\log t$ rather than $t^{-1}$ as expected from the decay 
properties of solutions to $\Box h = 0$ in Minkowski 
space. This seems to indicate that the wave 
coordinates may not be suitable to study the long- 
time behavior of solutions to the Einstein field 
equations. This negative conclusion is also consis- 
tent with the fact that the eqns [4] do not verify 
Klainerman's null condition. (Klainerman's null 
condition (Klainerman 1983) is an algebraic condi- 
tion on systems of nonlinear wave equations in 
(1 + 3) dimensions, similar to [4], which allows one 
to extend all local solutions, corresponding to small 
initial data, for all time. Moreover, these solutions 
decay at the rate of $t^{-1}$ as $t\to\infty$, consistent with the 
decay of free waves.) Lindblad and Rodnianski 
(2003) were able to isolate a new condition, which 
they call the weak null condition, verified by the 
wave coordinates reduced Einstein eqns [4], for 
which one can prove a small data global existence 
result consistent with the weaker decay rates 
suggested by the linear asymptotic analysis of 
Bruhat. Although the new result provides far 
weaker peeling information than [1], it is much 
simpler to prove than both Theorems 4 and 5. 
Moreover, the result seems to apply to a broader 
class of initial data than in Theorems 4 and 5. It 
remains an intriguing open problem whether the 
result of Lindblad-Rodnianski can be used as a 
stepping stone towards the more complete results of 
Theorems 4 and 5; that is, once a complete 
solution, with limited peeling, is known to exist, 
whether one can improve it, using the more precise 
techniques employed in Theorems 4 and 5 minus an 
important part of their technical complications, to the 
weak peeling properties of [1]. 


Strong Peeling 


The weak peeling properties [1] derived in Theorems
4 and 5 are consistent, from a scaling point of view,
with the SAF condition. To derive strong peeling,
see [3], one needs stronger asymptotic conditions.
Recently, Corvino-Schoen and Chruściel and Delay
(2002) have proved the existence of a large class of
asymptotically flat initial data sets $(\Sigma, g, k)$ which
are precisely stationary, that is, $g = g_{Kerr}$, $k = k_{Kerr}$
outside a sufficiently large compact set (here $g_{Kerr}$,
$k_{Kerr}$ are the initial data of a Kerr solution in
standard coordinates). Moreover, they have proved the existence
of sufficiently small solutions in this class which
satisfy the requirements needed in Friedrich's conformal
compactification method (see Friedrich
(2002) and the references within) to produce
asymptotically simple spacetimes, that is, spacetimes
satisfying Penrose's regular compactification condition
(Penrose 1962). Simultaneously, Klainerman
and Nicolò (1999) were able to refine the methods
used in the proof of Theorem 5 to prove the
following:


Theorem 6 Assume that the initial data set $(\Sigma, g, k)$
of Theorem 5 satisfies the stronger assumption,


$$g - g_S = O(r^{-3/2-\gamma}), \qquad k = O(r^{-5/2-\gamma}) \qquad [5]$$

for some $\gamma > 3/2$. Here


$$g_S = \left(1 - \frac{2M}{r}\right)^{-1} dr^2 + r^2\,(d\theta^2 + \sin^2\theta\, d\phi^2)$$


denotes the restriction of the Schwarzschild metric to $t = 0$
in standard polar coordinates. Then, in addition to
the results reported in Theorem 5, we have the
strong peeling estimates,


$$\alpha = O(r^{-5}), \qquad \beta = O(r^{-4})$$


as $r \to \infty$ along the outgoing null leaves $C(u)$.
Moreover, the same conclusions hold true if [5] is
replaced by


$$g - g_{Kerr} = O(r^{-3/2-\gamma}), \qquad k - k_{Kerr} = O(r^{-5/2-\gamma}) \qquad [6]$$


for some $\gamma > 5/2$.


The first part of the theorem was proved in
Klainerman and Nicolò (2003b). The second part is
work in progress by Klainerman and Nicolò. The
existence of initial conditions of the type required in
Theorem 6 was established in the works of Corvino
(2000) and Chruściel and Delay (2002).


Open Problems 


Problem 1 Extend results of Theorems 5 and 6 to 
the whole domain of dependence, for small data sets. 


The results of Theorems 5 and 6 give a 
satisfactory description of gravitational radiation of 
general classes of asymptotically flat initial data sets 
outside the domain of dependence of a sufficiently 
large compact set. It would be desirable to extend 
these results to the whole domain of dependence of 
initial data sets which satisfy an additional global 
smallness assumption similar to that of Theorem 4. 


Problem 2 Is strong peeling (and implicitly asymptotic
simplicity) consistent with physically relevant
data? If not, is weak peeling a good substitute?


Damour and Christodoulou (2000) have given
conclusive evidence that, under a no-incoming-radiation
condition, future null infinity cannot
be smooth. In fact, $\beta = O(r^{-4}\log r)$ as $r \to \infty$.


Problem 3 Can one weaken the AF conditions to 
include, for example, initial data sets with infinite 
ADM angular momentum? 


It is reasonable to expect a global stability of
Minkowski result for small initial data sets which
verify, for arbitrarily small $\varepsilon$,


$$g - \left(1 + \frac{2M}{r}\right)\delta = O(r^{-1-\varepsilon}), \qquad k = O(r^{-2-\varepsilon})$$


One expects in this case that the top null components
$\alpha$ and $\beta$ decay only like $O(r^{-3})$ as $r \to \infty$ along the
null hypersurfaces $C(u)$. It seems that the methods of
Lindblad-Rodnianski can treat this case but can only
give decay estimates for $\alpha$, $\beta$ of the form $O(r^{-3+\varepsilon})$.


Problem 4 Is the Kerr solution in the exterior of
the black hole stable?


The problem remains wide open. 


See also: Asymptotic Structure and Conformal Infinity; 
Classical Groups and Homogeneous Spaces; Critical 
Phenomena in Gravitational Collapse; Einstein 
Equations: Exact Solutions; Geometric Analysis and 
General Relativity; Supergravity.
Introduction


The long-term stability of planets and satellites might
be inferred from the regular dynamics that we
constantly observe. However, the ultimate fate of
the solar system is an intriguing question, which has 
puzzled scientists since antiquity. In the past cen- 
turies, the common belief of a regular motion of the 
main planets was strengthened by the discovery of a 
simple law, due to J D Titius and J E Bode (eight- 
eenth century), which provides a recipe to compute 
the approximate distances of the planets from the 
Sun. Adopting astronomical units as a measure of the 
distance, the Titius-Bode law can be stated as 


$$d_n = 0.4 + 0.3 \times 2^n \ \text{AU} \qquad [1]$$


where the index $n$ must be selected as provided in
Table 1, which compares the distances computed
according to [1] with the observed values. Titius and
Bode already noticed that it was necessary to skip
one unit in $n$ from Mars to Jupiter; indeed, the
quantity $d_3 = 2.8$ AU might correspond to an average
distance of some minor bodies of the asteroid
belt, which have been discovered since the beginning
of the nineteenth century. The studies of the N-body
problem, namely the dynamics of N mutually 
attracting bodies (according to Newton’s law), 
inspired several mathematical and physical theories: 
from the development of perturbation methods to 
the discovery of chaotic systems, as attested by the 
masterly work of H Poincaré (1892). In particular, 
perturbation theory had relevant applications in 
celestial mechanics; for example, it led to the 
prediction of the existence of Neptune in the 
nineteenth century by J C Adams and U Leverrier 


Table 1 Titius-Bode law and observed data

Planet    Index n     Distance computed    Observed
          (of [1])    from [1] (AU)        distance (AU)

Mercury   -∞          0.4                  0.39
Venus     0           0.7                  0.72
Earth     1           1                    1
Mars      2           1.6                  1.52
Jupiter   4           5.2                  5.2
Saturn    5           10                   9.54
Uranus    6           19.6                 19.19


and later to the discovery of Pluto by C Tombaugh, 
as a result of unexplained perturbations on Uranus 
and Neptune, respectively. Modern advances in 
perturbation theories have been provided by the 
Kolmogorov-Arnol'd-Moser (KAM) and Nekhoroshev
theorems, which find broad applications in
celestial mechanics insofar as simple model pro- 
blems are concerned. 
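
The rule [1] is simple enough to tabulate directly. The sketch below is only an illustration: it evaluates [1] for the indices listed in Table 1, with Mercury treated as the limit $n \to -\infty$ (an assumption that reproduces the table entry 0.4 AU), and prints the result next to the observed distances. The names `TABLE_1` and `titius_bode` are introduced here for the example.

```python
import math

# Index n from Table 1 (Mercury corresponds to n -> -infinity)
# and observed distances in AU.
TABLE_1 = {
    "Mercury": (-math.inf, 0.39),
    "Venus": (0, 0.72),
    "Earth": (1, 1.0),
    "Mars": (2, 1.52),
    "Jupiter": (4, 5.2),
    "Saturn": (5, 9.54),
    "Uranus": (6, 19.19),
}


def titius_bode(n: float) -> float:
    """Distance in AU predicted by [1]: d_n = 0.4 + 0.3 * 2**n."""
    return 0.4 + 0.3 * 2.0 ** n


for planet, (n, observed) in TABLE_1.items():
    predicted = titius_bode(n)
    print(f"{planet:8s} n={n!s:>5} predicted={predicted:5.1f} observed={observed:5.2f}")
```

Evaluating `titius_bode(3)` gives the skipped value of about 2.8 AU associated with the asteroid belt.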

The stability of the solar system can also be 
approached through numerical investigations, which 
allow one to predict the motion of the celestial 
bodies using more realistic models. The results of 
the numerical integrations undermine in some cases 
the apparent regularity of the solar system: in the 
following sections, we shall review many examples 
of regular and chaotic motions in different contexts 
of celestial mechanics, from the N-body problem to 
the rotational dynamics. 


The Restricted Three-Body Problem 


Let $P_1, \ldots, P_N$ be $N$ bodies with masses $m_1, \ldots, m_N$,
which interact through Newton's law. Let $u^{(i)} \in \mathbb{R}^3$,
$i = 1, 2, \ldots, N$, denote the position of the bodies
in an inertial reference frame. Normalizing the
gravitational constant to 1, the equations of motion
of the N-body problem have the form

$$\frac{d^2 u^{(i)}}{dt^2} = -\sum_{j=1,\ j \neq i}^{N} \frac{m_j\,(u^{(i)} - u^{(j)})}{|u^{(i)} - u^{(j)}|^3}, \qquad i = 1, \ldots, N \qquad [2]$$

In the case N=2, one reduces to the two-body 
problem, which can be explicitly solved by means of 
Kepler’s laws as follows. Consider, for example, the 
Earth-Sun case: for negative values of the energy, 
the trajectory of the Earth is an ellipse with one 
focus coinciding with the barycenter, which can 
practically be identified with the Sun; the Earth-Sun 
radius vector describes equal areas in equal times; 
the cube of the semimajor axis is proportional to the 
square of the period of revolution. 

Consider now an extension to the study of three
bodies such that in the Keplerian approximation $P_2$
and $P_3$ move around $P_1$ and such that the
semimajor axis of $P_2$ is greater than that of $P_3$ (an
example is obtained identifying $P_1$ with the Sun, $P_2$
with Jupiter, and $P_3$ with an asteroid of the
main belt). The three-body problem is described by
[2] setting $N = 3$; a special case is given by the
restricted three-body problem, which describes the
evolution of a "zero-mass" body under the gravitational
attraction exerted by an assigned two-body
system. Setting $N = 3$ and $m_3 = 0$ in [2], the


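equations governing the restricted three-body problem
are given by

$$\frac{d^2 u^{(1)}}{dt^2} = -\frac{m_2\,(u^{(1)} - u^{(2)})}{|u^{(1)} - u^{(2)}|^3}$$

$$\frac{d^2 u^{(2)}}{dt^2} = -\frac{m_1\,(u^{(2)} - u^{(1)})}{|u^{(2)} - u^{(1)}|^3}$$

$$\frac{d^2 u^{(3)}}{dt^2} = -\frac{m_1\,(u^{(3)} - u^{(1)})}{|u^{(3)} - u^{(1)}|^3} - \frac{m_2\,(u^{(3)} - u^{(2)})}{|u^{(3)} - u^{(2)}|^3}$$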


The first two equations concern the motion of the
primaries $P_1$ and $P_2$ and they correspond to a
Keplerian two-body problem, whose solution can
be inserted in the equation for $u^{(3)}$, which becomes a
periodically forced second-order equation. The
restricted three-body problem can be conveniently
described in terms of suitable action-angle coordinates,
known as Delaunay variables. The present
discussion is restricted to the planar case, namely we
assume that the motion of the three bodies takes
place on the same plane. The corresponding Delaunay
variables, say $(L, G, \ell, \gamma) \in \mathbb{R}^2 \times \mathbb{T}^2$, are defined
as follows (Szebehely 1967). Let $a$ and $e$ be,
respectively, the semimajor axis and the eccentricity
of the osculating orbit of $P_3$ and let $\mu = 1/m_1^{2/3}$; then
Delaunay's action variables are given by


$$L = \mu\sqrt{m_1 a}, \qquad G = L\sqrt{1 - e^2}$$


Next, introduce the angle variables: we denote by $\lambda$
and $\varphi$ the longitudes of Jupiter and of the asteroid;
let $\gamma$ be the argument of perihelion, namely the angle
formed by the periapsis direction with a preassigned
reference line, and let $u$ denote the eccentric
anomaly, which can be defined through


$$\tan\frac{\varphi - \gamma}{2} = \sqrt{\frac{1 + e}{1 - e}}\,\tan\frac{u}{2} \qquad [3]$$

Let $\ell$ be the mean anomaly, which is related to the
eccentric anomaly by means of Kepler's equation


$$\ell = u - e\sin u \qquad [4]$$
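
Kepler's equation [4] has no closed-form inverse, so in practice the eccentric anomaly is obtained iteratively. The sketch below uses Newton's method (one common choice, not prescribed by the text) and then recovers the true anomaly from the half-angle relation [3]. As a consistency check it evaluates the orbital radius with the two standard Keplerian expressions, $a(1 - e\cos u)$ and $a(1 - e^2)/(1 + e\cos f)$. All function names and the sample values of $a$, $e$ and $\ell$ are illustrative assumptions.

```python
import math


def eccentric_anomaly(mean_anomaly: float, e: float, tol: float = 1e-14) -> float:
    """Solve Kepler's equation l = u - e*sin(u) for u by Newton's method."""
    u = mean_anomaly if e < 0.8 else math.pi  # common starting guesses
    for _ in range(50):
        delta = (u - e * math.sin(u) - mean_anomaly) / (1.0 - e * math.cos(u))
        u -= delta
        if abs(delta) < tol:
            break
    return u


def true_anomaly(u: float, e: float) -> float:
    """True anomaly f from tan(f/2) = sqrt((1+e)/(1-e)) * tan(u/2)."""
    return 2.0 * math.atan2(math.sqrt(1.0 + e) * math.sin(u / 2.0),
                            math.sqrt(1.0 - e) * math.cos(u / 2.0))


def radius_from_eccentric_anomaly(a: float, e: float, u: float) -> float:
    return a * (1.0 - e * math.cos(u))


def radius_from_true_anomaly(a: float, e: float, f: float) -> float:
    return a * (1.0 - e * e) / (1.0 + e * math.cos(f))


# Hypothetical asteroid-like orbit: a = 0.6, e = 0.1, mean anomaly 1 rad.
a, e, mean_anomaly = 0.6, 0.1, 1.0
u = eccentric_anomaly(mean_anomaly, e)
f = true_anomaly(u, e)
r_u = radius_from_eccentric_anomaly(a, e, u)
r_f = radius_from_true_anomaly(a, e, f)
```

The two radii agree to rounding error, which confirms that the half-angle relation and Kepler's equation are being applied consistently.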


Delaunay's angle variables are represented by the
mean anomaly $\ell$ and by the argument of perihelion
$\gamma$. For completeness, it should be remarked that
the distance $r$ between the minor body $P_3$ and the
primary $P_1$ is related to the longitude and to the
eccentric anomaly by means of the relations


$$r = \frac{a(1 - e^2)}{1 + e\cos(\varphi - \gamma)} = a(1 - e\cos u) \qquad [5]$$


In a reference frame centered at one of the
primaries, say $P_1$, let $H = H(L, G, \ell, \gamma, \lambda)$ denote
the Hamiltonian function describing the planar
problem; notice that $H(L, G, \ell, \gamma, \lambda)$ has two degrees
of freedom and an explicit time dependence through
the longitude $\lambda$ of $P_2$. If the primaries are assumed to
move in circular orbits around their common center
of mass, the Hamiltonian function reduces to an
autonomous one with two degrees of freedom, where a
new variable $g$ is introduced as the difference between
the argument of perihelion $\gamma$ and the longitude $\lambda$ of
the primary. Normalizing the units of measure so that the
distance between the primaries and the sum of
their masses are unity, the Hamiltonian function $H$
describing the circular, planar, restricted three-body
problem is given by


$$H(L, G, \ell, g) = -\frac{1}{2L^2} - G + \varepsilon F(L, G, \ell, g) \qquad [6]$$


where $\varepsilon = \mu m_2$. The perturbing function takes the
form


$$F = r\cos(f + g) - \frac{1}{\sqrt{1 + r^2 - 2r\cos(f + g)}}$$


where $f = \varphi - \gamma$ represents the true anomaly, namely
the angle formed by the instantaneous orbital radius
with the periapsis line. Notice that the quantities $r$
and $f$ are functions of the Delaunay variables
through the relations [3]-[5]. As a consequence,
one can expand the perturbing function in the form
(Delaunay 1860)


$$F(L, G, \ell, g) = \sum_{j, k \geq 0} F_{jk}(\ell, g)\, L^j e^k$$


where $F_{jk}$ are cosine terms with arguments given by
a linear combination of the variables $\ell$ and $g$. For
example, the first few terms of the series development
are given by the following expression:


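$$\begin{aligned}
F(L, G, \ell, g) = {} & -1 - \frac{L^4}{4} - \frac{9}{64}L^8 - \frac{3}{8}L^4 e^2 + \left(\frac{1}{2} + \frac{9}{16}L^4\right) L^4 e\cos\ell \\
& - \left(\frac{3}{8}L^6 + \frac{15}{64}L^{10}\right)\cos(\ell + g) + \frac{9}{4}L^4 e\cos(\ell + 2g) \\
& - \left(\frac{3}{4}L^4 + \frac{5}{16}L^8\right)\cos(2\ell + 2g) - \frac{3}{4}L^4 e\cos(3\ell + 2g) \\
& - \left(\frac{5}{8}L^6 + \frac{35}{128}L^{10}\right)\cos(3\ell + 3g) - \frac{35}{64}L^8\cos(4\ell + 4g) \\
& - \frac{63}{128}L^{10}\cos(5\ell + 5g) + \cdots \qquad [7]
\end{aligned}$$

where the eccentricity is a function of the actions
through $e = \sqrt{1 - G^2/L^2}$. We remark that the
Hamiltonian [6] is nearly integrable with perturbing
parameter $\varepsilon$; indeed, for $\varepsilon = 0$ one recovers the
two-body problem describing the interaction between $P_1$
and $P_3$, which can be explicitly solved according to
Kepler's laws.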
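
The structure of the expansion [7] can be checked numerically in the simplest case of a circular osculating orbit, $e = 0$. The sketch below assumes that $r = a = L^2$ in that case (which corresponds to taking $m_1 \approx 1$ in the normalized units) and that the $e$-independent coefficients of [7] are those of the Legendre expansion of $1/\sqrt{1 + r^2 - 2r\cos\theta}$ truncated at order $r^5$. Both are assumptions of this example, and the function names are hypothetical.

```python
import math


def perturbing_function(r: float, theta: float) -> float:
    """Closed form F = r*cos(theta) - 1/sqrt(1 + r^2 - 2 r cos(theta)), theta = f + g."""
    return r * math.cos(theta) - 1.0 / math.sqrt(1.0 + r * r - 2.0 * r * math.cos(theta))


def truncated_series_circular(L: float, l: float, g: float) -> float:
    """Terms of the expansion that survive at e = 0, through order L**10."""
    return (-1.0 - L**4 / 4 - 9 * L**8 / 64
            - (3 * L**6 / 8 + 15 * L**10 / 64) * math.cos(l + g)
            - (3 * L**4 / 4 + 5 * L**8 / 16) * math.cos(2 * l + 2 * g)
            - (5 * L**6 / 8 + 35 * L**10 / 128) * math.cos(3 * l + 3 * g)
            - 35 * L**8 / 64 * math.cos(4 * l + 4 * g)
            - 63 * L**10 / 128 * math.cos(5 * l + 5 * g))


# Sample point: L = 0.5 (so r = 0.25), with arbitrary angles.
L, l, g = 0.5, 0.3, 1.1
r = L**2                      # circular orbit: r = a = L^2 (assumed normalization)
closed = perturbing_function(r, l + g)   # at e = 0 the true anomaly f equals l
series = truncated_series_circular(L, l, g)
remainder_bound = r**6 / (1.0 - r)       # tail of the Legendre series, |P_n| <= 1
```

The difference between `closed` and `series` stays below `remainder_bound`, as expected for a truncation of a convergent Legendre series.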


KAM Stability 


Classical perturbation theory, as developed by
Laplace, Lagrange, Delaunay, Poincaré, etc., does
not allow investigation of the stability of the N-body
problem, since the series defining the solution are
generally divergent. In order to justify this statement,
let us start by rewriting the unperturbed
Hamiltonian in [6] as

$$h(L, G) = -\frac{1}{2L^2} - G \qquad [8]$$

so that [6] becomes $H(L, G, \ell, g) = h(L, G) + \varepsilon F(L, G, \ell, g)$.
In order to remove the perturbation
to the second order in the perturbing parameter, one
looks for a change of variables $(L, G, \ell, g) \to (L', G', \ell', g')$
close to the identity, that is,

$$\begin{aligned}
L &= L' + \varepsilon\frac{\partial\Phi}{\partial\ell}(L', G', \ell, g) \\
G &= G' + \varepsilon\frac{\partial\Phi}{\partial g}(L', G', \ell, g) \\
\ell' &= \ell + \varepsilon\frac{\partial\Phi}{\partial L'}(L', G', \ell, g) \\
g' &= g + \varepsilon\frac{\partial\Phi}{\partial G'}(L', G', \ell, g)
\end{aligned}$$

where $\Phi(L', G', \ell, g)$ is the generating function of the
transformation. Let

$$\omega(L) \equiv \frac{\partial h}{\partial L}(L, G) = \frac{1}{L^3}$$

In order to perform a first-order perturbation
theory, we look for a generating function
$\Phi(L', G', \ell, g)$, such that the transformed Hamiltonian
is integrable up to $O(\varepsilon^2)$, namely


$$h\left(L' + \varepsilon\frac{\partial\Phi}{\partial\ell},\ G' + \varepsilon\frac{\partial\Phi}{\partial g}\right) + \varepsilon F(L', G', \ell, g) + O(\varepsilon^2) = h_1(L', G') + O(\varepsilon^2)$$

where $h_1(L', G')$ is the new unperturbed Hamiltonian.
If we denote by $F_0(L', G')$ the average of the
perturbing function over the angle variables, the
new unperturbed Hamiltonian takes the form


$$h_1(L', G') = h(L', G') + \varepsilon F_0(L', G')$$


Expanding $F$ in Fourier series as
$F(L, G, \ell, g) = \sum_{n, m \in \mathbb{Z}} F_{nm}(L, G)\, e^{i(n\ell + mg)}$, the generating function
is given by the following expression:


$$\Phi(L', G', \ell, g) = i \sum_{(n, m) \in \mathbb{Z}^2 \setminus \{0\}} \frac{F_{nm}(L', G')}{\omega(L')\, n - m}\, e^{i(n\ell + mg)}$$

The occurrence of small divisors of the form


$$\frac{1}{\omega(L')\, n - m}, \qquad n, m \in \mathbb{Z}$$

might prevent the convergence of the series defining
the generating function. In particular, we remark
that zero divisors occur whenever $\omega(L) = m/n$. This
situation, which is called an $m{:}n$ orbit-orbit
resonance, implies that during a given interval of
time the body $P_3$ makes $m$ revolutions, whereas $P_2$
makes exactly $n$ orbits about $P_1$.
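
The difference between a resonant and a strongly nonresonant frequency can be seen by scanning the divisors $|\omega n - m|$ directly. The sketch below is an illustration only: it compares an exactly resonant value, $\omega = 5/2$, with the golden mean, a standard example of a frequency whose divisors shrink no faster than a constant over $n$. The function name and the search range are choices made for this example.

```python
import math
from fractions import Fraction


def smallest_divisor(omega, max_order):
    """Return (|omega*n - m|, n, m) minimising the divisor over 1 <= n <= max_order.

    For each n the best integer m is the one nearest to omega*n.
    """
    best = None
    for n in range(1, max_order + 1):
        m = round(omega * n)
        value = abs(omega * n - m)
        if best is None or value < best[0]:
            best = (value, n, m)
    return best


# Exactly resonant frequency: the divisor vanishes at n = 2, m = 5.
resonant = smallest_divisor(Fraction(5, 2), 10)

# Golden mean: the divisors decrease with n, but only like a constant over n.
golden = (1.0 + math.sqrt(5.0)) / 2.0
value, n, m = smallest_divisor(golden, 1000)
scaled = n * value  # stays bounded away from zero
```

For the golden mean the product `n * |omega*n - m|` never drops below roughly 0.38, which is the kind of lower bound that a diophantine condition formalizes.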

The control of the occurrence of the small divisors 
was obtained through a theorem by A N Kolmogorov, 
who made a major breakthrough in the study 
of nearly integrable systems. He proved, under 
general assumptions, that some regions of the 
phase space are almost filled by maximal invariant 
tori. The theorem provides a constructive algorithm 
to give estimates on the perturbing parameter, 
ensuring the existence of some invariant surfaces. 
Kolmogorov's theorem was later extended by 
V I Arnol'd and J Moser, giving rise to the so-called 
KAM theory. More precisely, the KAM theorem
can be stated as follows (see, e.g., Arnol'd et al.
(1997)): consider a real-analytic, nearly integrable
Hamiltonian function and fix a rationally independent
frequency vector $\omega$; if the unperturbed
Hamiltonian is not degenerate and if the frequency
satisfies a strong nonresonance assumption (called
the diophantine condition), then, for sufficiently small
values of the perturbing parameter, there exists an
invariant torus on which a quasiperiodic motion
with frequency $\omega$ takes place. A preliminary
investigation of the stability of the N-body problem 
by means of KAM theory (Arnol'd et al. 1997) 
leads to the existence of large regions filled by 
quasiperiodic motions, provided the masses of the 
planets are sufficiently small. Arnol'd's version of 
KAM theorem has been applied by J Laskar and P 
Robutel to the spatial three-body planetary problem 
(the planetary problem concerns the study of the 


dynamics of two bodies with comparable masses, 
moving in the gravitational field of a larger primary) 
and the existence of quasiperiodic motions has been
proved for values of the ratio of the semimajor axes less
than 0.8 and for inclinations up to $\sim 1°$.

Concrete estimates on the strength of the perturbation
were given by M Hénon: in the context of the
three-body problem, the application of the original
version of Arnol'd's theorem allows one to prove the
existence of invariant tori for values of the perturbing
parameter (representing the Jupiter-Sun mass ratio)
$\varepsilon \leq 10^{-333}$, while the implementation of Moser's theorem
provides an estimate of $10^{-50}$. We remark that the
astronomical value of the Jupiter-Sun mass ratio
amounts to $\sim 10^{-3}$, showing a large discrepancy
between KAM results and physical measurements.
More recently, KAM estimates have been refined and 
adapted to the study of significant problems of celestial 
mechanics (Celletti and Chierchia 1995). Strong 
improvements have been obtained combining accurate 
estimates with a computer-assisted implementation, 
where the computer is used to perform long computa- 
tions concerning the development of the perturbing 
series and the check of KAM estimates. The numerical 
errors are controlled through the implementation of a 
suitable technique, known as interval arithmetic. In 
the framework of the planar, circular, restricted three- 
body problem, the stability of some asteroids has been 
proved by A Celletti and L Chierchia for realistic 
values of the perturbing parameter (e.g., for $\varepsilon = 10^{-3}$).
A suitable approximation of the disturbing function 
(namely, a finite truncation of the series development 
as in [7]) has been considered. The result relies on an 
implementation of a computer-assisted isoenergetic 
KAM theorem and on the following remark: in the 
four-dimensional phase space, on a fixed energy level 
the invariant two-dimensional surfaces separate the 
phase space, providing the stability of the actions for 
all motions trapped between any two invariant tori. 
Since the action variables are related to the semimajor 
axis and to the eccentricity of the orbit, one obtains 
that the elliptic elements remain close to their initial 
values. 

A computer-assisted KAM theorem has been 
applied by A Giorgilli and U Locatelli to the 
planetary (Jupiter-Saturn) problem. Using a suitable 
secular approximation, it can be shown that this 
model admits two invariant tori, which bound the 
orbits corresponding to the initial data of Jupiter 
and Saturn. 


Nekhoroshev Stability 


A different approach in order to study the stability
of nearly integrable systems is provided by
Nekhoroshev's theorem (see, e.g., Arnol'd et al.
(1997)), which guarantees, under smallness requirements,
the stability of the motions on an open set of
initial conditions for exponentially long times.
Consider a Hamiltonian function of the form


$$H(y, x) = h(y) + \varepsilon f(y, x), \qquad (y, x) \in B \times \mathbb{T}^n \qquad [9]$$


where $B$ is an open subset of $\mathbb{R}^n$. We assume that $h$
and $f$ are analytic functions and that the integrable
Hamiltonian $h$ satisfies a geometric condition, called
steepness. We remark that functions such as $h(L, G)$
in [8] satisfy the steepness condition. For sufficiently
small values of $\varepsilon$, Nekhoroshev's theorem states that
any motion $(y(t), x(t))$ satisfying Hamilton's equations
associated with [9] is bounded for a finite (but
exponentially long) time, that is,


$$\|y(t) - y(0)\| \leq y_0\, \varepsilon^{a} \qquad \text{for } |t| \leq t_0\, e^{(\varepsilon_0/\varepsilon)^b}$$


where $y_0, t_0, \varepsilon_0, a$, and $b$ are suitable positive
constants.
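
To get a feeling for what "exponentially long" means, one can evaluate the stability time $t_0\, e^{(\varepsilon_0/\varepsilon)^b}$ for a few values of $\varepsilon$. The constants used below ($t_0 = 1$, $\varepsilon_0 = 1$, $b = 1/2$) are purely hypothetical placeholders chosen for illustration; the theorem only asserts that suitable positive constants exist.

```python
import math


def nekhoroshev_time(eps: float, t0: float = 1.0, eps0: float = 1.0, b: float = 0.5) -> float:
    """Stability time t0 * exp((eps0/eps)**b); the default constants are illustrative only."""
    return t0 * math.exp((eps0 / eps) ** b)


for eps in (1e-1, 1e-2, 1e-3):
    print(f"eps = {eps:g}: stability time ~ {nekhoroshev_time(eps):.3e}")
```

With these placeholder constants, reducing $\varepsilon$ by a factor of ten at each step multiplies the stability time by rapidly growing factors, in contrast with the power-law gain that a finite-order perturbative estimate would give.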

Nekhoroshev’s theorem can be conveniently 
applied to the three-body problem, where it provides 
a confinement of the action variables, representing 
the semimajor axis and the eccentricity of the 
osculating orbit. Interesting applications of 
Nekhoroshev’s theorem concern the investigation 
of the triangular Lagrangian points in the spatial, 
restricted three-body problem. (The Lagrangian 
points are five equilibrium positions of the planar, 
restricted three-body problem in a synodic reference 
frame, which rotates with the angular velocity of the 
primaries. Two of such positions are called trian- 
gular, since the configuration of the three bodies is 
an equilateral triangle in the orbital plane.) Effective 
estimates were developed by A Giorgilli and 
C Skokos, showing the existence of a stability 
region around the Lagrangian point L4, large 
enough to include some known asteroids. In the 
same framework, the exponential stability was 
proven by G Benettin, F Fassò, and M Guzzo for
all values of the mass-ratio parameter, except for a
few values of the reduced mass $\mu$, up to $\mu \simeq 0.038$.


Numerical Results 


The stability of the N-body problem can
be investigated by performing numerical integrations
of the equations of motion. The dynamics of the
outer planets of the solar system (from Jupiter to
Pluto) has been explored by Sussman and Wisdom
(1992) using a dedicated computer, the Digital
Orrery. The integration of the equations of motion
was performed over 845 million years; the results
provided evidence of the stability of the major
planets and a chaotic behavior of Pluto. An
alternative approach, based on an average of the
equations of motion over fast angles, was adopted
by Laskar (1995), where the perturbing function of
the spatial problem was expanded up to the second
order in the masses and up to the fifth powers of the
eccentricity and the inclination. The dynamics of all
planets (excluding Pluto) was investigated by means
of frequency analysis over a time span ranging from
-15 Gyr to +10 Gyr. The numerical integrations
provided evidence of the regularity of the external
planets (from Jupiter to Neptune), a moderate
chaotic behavior of Venus and the Earth, and a
marked chaotic dynamics of Mercury and Mars.
The computations show that the inner solar system
is chaotic, with a Lyapunov time of $\approx 5$ Myr, thus
preventing any prediction of the evolution over
100 Myr.
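
The link between a Lyapunov time of about 5 Myr and the loss of predictability over 100 Myr is a matter of simple arithmetic: an initial uncertainty grows roughly like $e^{t/T_L}$. The sketch below evaluates this growth factor, under the idealized assumption of purely exponential divergence at a constant rate; the function name is introduced for this example.

```python
import math


def error_growth_factor(elapsed_myr: float, lyapunov_time_myr: float = 5.0) -> float:
    """Factor by which a small initial error grows, assuming exp(t / T_L) divergence."""
    return math.exp(elapsed_myr / lyapunov_time_myr)


factor_100_myr = error_growth_factor(100.0)  # exp(20)
print(f"growth factor over 100 Myr: {factor_100_myr:.3e}")
```

Under this assumption the factor over 100 Myr is $e^{20}$, several hundred million, so that an initial error of a few metres in a planetary position would grow to a length comparable with the size of the inner orbits.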


The Spin-Orbit Problem 


The dynamics of the bodies of the solar system
results from a combination of a motion of revolution
around a primary body and a rotation
about an internal axis. A simple mathematical
model describing the spin-orbit interaction can
be introduced as follows. Let $S$ be a triaxial
ellipsoidal satellite, which moves about a central
planet $P$. We denote by $T_{rev}$ and $T_{rot}$ the periods of
revolution and rotation. A $p{:}q$ spin-orbit resonance
occurs if


$$\frac{T_{rev}}{T_{rot}} = \frac{p}{q}, \qquad \text{for } p, q \in \mathbb{N},\ q \neq 0$$


Whenever $p = q = 1$, the satellite always shows the
same face to the host planet. Most of the evolved
satellites or planets are trapped in a 1:1 resonance,
with the only exception of Mercury, which is
observed in a nearly 3:2 resonance. In order to
introduce a simple mathematical model which
describes the spin-orbit interaction, we assume that:


1. the satellite moves on a Keplerian orbit around the 
planet (with semimajor axis a and eccentricity e); 

2. the spin axis is perpendicular to the orbit plane; 

3. the spin axis coincides with the shortest physical 
axis; and 

4. dissipative effects as well as perturbations due to 
other planets or satellites are neglected. 


We denote by $A < B < C$ the principal moments
of inertia of the satellite and by $r$ and $f$, respectively,
the instantaneous orbital radius and the true
anomaly of the Keplerian orbit. Let $x$ be the angle
between the longest axis of the ellipsoid and a
preassigned reference line. From the standard Euler
equations for a rigid body, the equation of motion in
normalized units (i.e., assuming that the period of
revolution is $2\pi$) takes the form


$$\ddot{x} + \varepsilon\left(\frac{a}{r}\right)^3 \sin(2x - 2f) = 0 \qquad [10]$$


where $\varepsilon = \frac{3}{2}(B - A)/C$. This equation is integrable
whenever $A = B$ or in the case of zero orbital
eccentricity. Due to the assumption of Keplerian
motion, both $r$ and $f$ are known functions of the
time. Therefore, we can expand [10] in Fourier
series as


$$\ddot{x} + \varepsilon \sum_{m \neq 0,\ m = -\infty}^{\infty} W\!\left(\frac{m}{2}, e\right) \sin(2x - mt) = 0 \qquad [11]$$


where the coefficients $W(m/2, e)$ decay as
$W(m/2, e) \propto e^{|m - 2|}$. A further simplification of the
model is obtained as follows. According to assumption (4), we
neglected the dissipative forces and perturbations
due to other bodies. The most important contribution
is due to the nonrigidity of the satellite,
provoking a tidal torque caused by the internal
friction. The size of the dissipative effects is
small compared to the gravitational
terms. Therefore, we decide to retain in [11] only
those terms which are of the same order or larger
than the average effect of the tidal torque. The
following equation results:


$$\ddot{x} + \varepsilon \sum_{m \neq 0,\ m = N_1}^{N_2} \tilde{W}\!\left(\frac{m}{2}, e\right) \sin(2x - mt) = 0 \qquad [12]$$


where $N_1$ and $N_2$ are suitable integers, which depend
on the physical and orbital parameters of the satellite,
while $\tilde{W}(m/2, e)$ are suitable truncations of the
coefficients $W(m/2, e)$. We remark that eqn [12] can
be derived from Hamilton's equations associated
with a one-dimensional, time-dependent, nearly
integrable Hamiltonian function with perturbing
parameter $\varepsilon$ and a trigonometric disturbing function.

Analytical Results 


The phase space associated with [12] admits a 
Poincaré map showing a pendulum-like structure: 
the periodic orbits are surrounded by librational 
curves and the chaotic separatrix divides the libra- 
tional regime from the region where rotational 
motions can take place. The three-dimensional 
phase space is separated by KAM rotational tori 
into invariant regions, providing a strong stability 
property for all motions confined between any pair 
of KAM rotational tori. Let us denote by $P(p/q)$ a
periodic orbit associated with the $p{:}q$ resonance; in
the context of the model associated with [12], the
stability of the periodic orbit $P(p/q)$ is obtained by
showing the existence of two invariant tori
$T(\omega_1)$ and $T(\omega_2)$ with $\omega_1 < p/q < \omega_2$. A refined
computer-assisted KAM theorem has been implemented
(Celletti 1990) with the aim of proving the
existence of trapping invariant surfaces. Realistic
estimates, in agreement with the physical values of
the parameters (namely, the equatorial oblateness $\varepsilon$
and the eccentricity $e$), have been obtained in several
examples of spin-orbit commensurabilities, like the 
1:1 Moon-Earth interaction or the 3:2 Mercury- 
Sun resonance. 

Concerning Nekhoroshev-type estimates, the 
classical D’Alembert problem has been studied by 
Biasco and Chierchia (2002). In particular, an 
equatorially symmetric oblate planet moving on a 
Keplerian orbit around a primary body has been 
investigated; the model does not assume any further 
constraint on the spin axis. Although the Hamilto- 
nian describing this model is properly degenerate, it 
is shown that Nekhoroshev-like results apply to the 
D'Alembert problem in the proximity of a 1:1 
resonance. 


Numerical Results 


The model introduced in [10]-[12] often represents an 
unrealistic simplification of the spin-orbit dynamics. 
In particular, assumption (1) implies that secular 
perturbations of the orbital parameters are neglected, 
whereas the hypothesis (2) corresponds to disregarding 
the spin-orbit obliquity, namely the angle formed by 
the rotational axis with the normal to the orbital 
plane. Due to the presence of an equatorial bulge, the 
gravitational attraction of the other bodies of the solar 
system induces a torque, resulting in a precessional 
motion. It is also important to take into account the 
changes of the obliquity angle, whose variations 
might affect the climatic behavior. 

A realistic model for the precession and the
variation of the obliquity has been presented by
Laskar (1995). The numerical simulations and the
frequency-map analysis show that the Earth's
obliquity is actually stable, although a large
chaotic region is found in the interval between
60° and 90°. Since the present obliquity of the
Earth amounts to ~23.3°, the Earth is outside the
dangerous region. An interesting simulation was
performed to evaluate the role played by the
Moon. Without the Moon, the extent of the
chaotic region would greatly increase, possibly
preventing the emergence of evolved life. Among
the inner planets, Mars' obliquity shows a larger
chaotic extent, which leads to variations from
0° to 60° in a few million years. On the contrary,
the external planets do not show significant
chaotic regions and their obliquities are essentially
stable.


See also: Averaging Methods; Dynamical Systems in 
Mathematical Physics: An Illustration from Water Waves; 
Gravitational N-Body Problem (Classical); Hamiltonian 
Systems: Stability and Instability Theory; KAM Theory 
and Celestial Mechanics; Multiscale Approaches; 
Stability Theory and KAM.
Introduction


A Hamiltonian system is a dynamical system whose
equations of motion can be written in terms of a
scalar function, called the Hamiltonian of the system:
if one uses coordinates $(p, q)$ in a domain (phase space)
$D \subset \mathbb{R}^{2N}$, where $N$ is the number of independent
variables one needs to identify a configuration of the
system (degrees of freedom), there is a function $H(p, q)$
such that $\dot{p} = -\partial H/\partial q$ and $\dot{q} = \partial H/\partial p$. An integrable
(Hamiltonian) system is a Hamiltonian system which,
in suitable coordinates $(A, \alpha) \in \mathcal{A} \times \mathbb{T}^N$, where $\mathcal{A}$ is an
open subset of $\mathbb{R}^N$ and $\mathbb{T} = \mathbb{R}/2\pi\mathbb{Z}$ is the standard
torus, can be described by a Hamiltonian $H_0(A)$,
that is, depending only on $A$. The coordinates
$(A, \alpha)$ are called action-angle variables. In such a
case the dynamics is trivial: any initial condition
$(A_0, \alpha_0)$ evolves in such a way that the action
variables are constants of motion (i.e., $A(t) = A_0$ for
all $t \in \mathbb{R}$), while the angles grow linearly in time as
$\alpha(t) = \alpha_0 + \omega t$, where $\omega = \omega(A_0) = \partial_A H_0(A_0)$ is
called the rotation (or frequency) vector. An
integrable system can be thought of as a collection
of decoupled (i.e., independent) rotators: the entire
phase space $\mathcal{A} \times \mathbb{T}^N$ is foliated into invariant tori
and all motions are quasiperiodic. Integrable
systems are stable, in the sense that nearby initial
conditions separate at most linearly in time (in
particular, the actions do not separate at all):
mathematically, this is expressed by the fact that
all the Lyapunov exponents are nonpositive.

An example of an integrable system is any one- 
dimensional conservative mechanical system, in any 
region of phase space in which motions are 
bounded. By increasing the number of degrees of 
freedom, exhibiting nontrivial integrable systems 
can become a difficult task. The problem of studying 
the effects of even small Hamiltonian perturbations 
on integrable systems and of understanding if the 
latter remain stable, in the aforementioned sense, 
was considered by Poincaré to be the fundamental 
problem of dynamics. For a long time, it was 
commonly thought that all motions could be 
reduced to superpositions of periodic motions, 
hence to quasiperiodic motions, but at the end of 
nineteenth century it was realized by Boltzmann and 
Poincaré that such a picture was too naive, and that 
in reality more complicated motions were possible. 


As a consequence of this, it became a widespread 
belief that, even when starting from an integrable 
system, the introduction of an arbitrarily small 
perturbation would break integrability. 

This belief was strengthened by the work of 
Poincaré (1898), who showed that the series 
describing the solution in a perturbation theory 
approach are in general divergent. The source of 
divergence in perturbation series is the presence of 
small divisors, that is, of denominators of the kind
$\omega \cdot \nu$, where $\omega$ is the rotation vector that should
characterize the invariant torus (if existent) and $\nu$ is
any integer vector. Despite this, however, perturbation
series (known as Lindstedt series) continued to
be extensively used by astronomers in problems of 
celestial mechanics, such as the study of planetary 
motions, for the simple reason that they provided 
predictions in good agreement with the observa- 
tions. But the feeling that the underlying mathema- 
tical tools were unsatisfactory persisted. 

In fact, the well-known Fermi-Pasta-Ulam
numerical experiment, in 1955, was originally
conceived in the spirit of confirming that integrability
would in general be easily lost. Consider a
chain of N harmonic oscillators, with, say,
periodic boundary conditions, coupled with cubic
and quartic two-body potentials, so that the
Hamiltonian is

$$H(p, q) = \sum_{i=1}^{N} \left[ \frac{1}{2} p_i^2 + W(q_{i+1} - q_i) \right], \qquad W(x) = \frac{1}{2} x^2 + \frac{\alpha}{3} x^3 + \frac{\beta}{4} x^4 \qquad [1]$$

for $\alpha, \beta$ real parameters and $(p, q) \in \mathbb{R}^N \times \mathbb{R}^N$. One
can introduce new variables such that the Hamiltonian,
for $\alpha = \beta = 0$, can be written as

$$H_0(A) = \frac{1}{2} \sum_{k=1}^{N} \left( P_k^2 + \omega_k^2 Q_k^2 \right) = \omega \cdot A \qquad [2]$$

for a suitable rotation vector $\omega = (\omega_1, \ldots, \omega_N) \in \mathbb{R}^N$
(an explicit computation gives $\omega_k = 2 \sin(k\pi/N)$).
Consider an initial condition in which all the
energy is confined to a few modes, that is, $A_k \neq 0$ at
$t = 0$ only for a few values of $k$. For $\alpha = \beta = 0$, the
system is integrable, so that $A_k(t) = 0$ for all $t \in \mathbb{R}$
and for all $k$ such that $A_k(0) = 0$. If the system ceases
to be integrable when the perturbation is switched 
on, the energy is likely to start to be shared among 
the various modes, and after a long enough time has
elapsed, an equidistribution of the energy among all
modes (thermalization) might be expected. At least 
this behavior was expected by Fermi, Pasta, and 
Ulam, but it was not what they found numerically: 
on the contrary, all the energy seemed to remain 
associated with the modes close to the few initially 
excited ones. 
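
The setting of the experiment can be sketched numerically. The following is a minimal Python sketch, not the original 1955 computation: it assumes the pair potential W(x) = x^2/2 + alpha x^3/3 + beta x^4/4 with periodic boundary conditions, and the function names, the velocity-Verlet integrator, and the Fourier decomposition into modes are illustrative choices. For alpha = beta = 0 the unexcited modes stay empty, while switching on alpha lets energy leak into other modes.

```python
import cmath
import math


def dW(x, alpha, beta):
    """Derivative of the assumed pair potential W(x) = x^2/2 + alpha x^3/3 + beta x^4/4."""
    return x + alpha * x * x + beta * x ** 3


def energy(q, p, alpha, beta):
    """Total energy H(p, q) of the chain with periodic boundary conditions."""
    n = len(q)
    total = 0.0
    for i in range(n):
        x = q[(i + 1) % n] - q[i]
        total += 0.5 * p[i] ** 2 + 0.5 * x * x + alpha * x ** 3 / 3 + beta * x ** 4 / 4
    return total


def force(q, alpha, beta):
    """Force on each site: -dH/dq_i = W'(q_{i+1} - q_i) - W'(q_i - q_{i-1})."""
    n = len(q)
    return [dW(q[(i + 1) % n] - q[i], alpha, beta) - dW(q[i] - q[i - 1], alpha, beta)
            for i in range(n)]


def evolve(q, p, alpha, beta, dt, steps):
    """Velocity-Verlet (symplectic) integration of the equations of motion."""
    q, p = list(q), list(p)
    f = force(q, alpha, beta)
    for _ in range(steps):
        p = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]
        q = [qi + dt * pi for qi, pi in zip(q, p)]
        f = force(q, alpha, beta)
        p = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]
    return q, p


def mode_energies(q, p):
    """Harmonic energy of each Fourier mode k, with omega_k = 2 sin(k pi / N)."""
    n = len(q)
    energies = []
    for k in range(n):
        phase = [cmath.exp(-2j * math.pi * k * j / n) for j in range(n)]
        Qk = sum(qj * ph for qj, ph in zip(q, phase)) / math.sqrt(n)
        Pk = sum(pj * ph for pj, ph in zip(p, phase)) / math.sqrt(n)
        wk = 2.0 * math.sin(math.pi * k / n)
        energies.append(0.5 * (abs(Pk) ** 2 + wk ** 2 * abs(Qk) ** 2))
    return energies


def single_mode(n, k, amplitude):
    """Initial condition with all the energy in the Fourier modes k and n - k."""
    q = [amplitude * math.cos(2 * math.pi * k * j / n) for j in range(n)]
    return q, [0.0] * n
```

A typical use is `q0, p0 = single_mode(8, 1, 1.0)` followed by `evolve(q0, p0, 0.25, 0.0, 0.05, 2000)` and a comparison of `mode_energies` before and after. The chain length, amplitude, and time step here are arbitrary; a faithful reproduction of the experiment would need longer chains and much longer integration times.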

At about the same time, Kolmogorov (1954) 
published a breakthrough paper going exactly in 
the opposite direction: if one perturbs an integrable 
system, under some mild conditions on the integr- 
able part, most of the tori are preserved, although 
slightly deformed. A more precise statement is the 
following. 


Theorem 1 Let an N-degree-of-freedom Hamiltonian
system be described by an analytic Hamiltonian
of the form

$$H(A, \alpha) = H_0(A) + \varepsilon f(A, \alpha) \qquad [3]$$

with $\varepsilon$ a real parameter (perturbation parameter),
$f$ a $2\pi$-periodic function of each angle variable
(potential or perturbation), and $H_0(A)$ satisfying
the nondegeneracy condition $\det \partial_A^2 H_0(A) \neq 0$
(anisochrony condition). If $\omega = \omega(A) = \partial_A H_0(A)$ is
fixed to satisfy the Diophantine condition

$$|\omega \cdot \nu| > \frac{C_0}{|\nu|^{\tau}} \qquad \forall \nu \in \mathbb{Z}^N \setminus 0 \qquad [4]$$

for some constants $C_0 > 0$ and $\tau > N - 1$ (here
$|\nu| = |\nu_1| + \cdots + |\nu_N|$ and $\cdot$ denotes the standard
inner product: $\omega \cdot \nu = \omega_1 \nu_1 + \cdots + \omega_N \nu_N$), then
there is an invariant torus with rotation vector $\omega$
for $\varepsilon$ small enough, say for $\varepsilon$ smaller than some value
$\varepsilon_0$ depending on $C_0$ and $\tau$ (and on the function $f$).
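
The Diophantine condition [4] can be probed numerically, at least over a finite range of integer vectors. The sketch below is only illustrative: the function name `diophantine_constant` and the cutoff `max_norm` are assumptions introduced here, and a finite search can suggest, but never prove, that a positive constant $C_0$ exists. It returns the smallest value of $|\omega \cdot \nu| \, |\nu|^{\tau}$ found, which is a candidate for $C_0$.

```python
import itertools
import math


def diophantine_constant(omega, tau, max_norm):
    """Smallest value of |omega . nu| * |nu|^tau over integer vectors nu != 0
    with |nu| = |nu_1| + ... + |nu_N| <= max_norm (a finite truncation)."""
    n = len(omega)
    best = math.inf
    for nu in itertools.product(range(-max_norm, max_norm + 1), repeat=n):
        norm = sum(abs(c) for c in nu)
        if norm == 0 or norm > max_norm:
            continue
        # small divisor associated with the integer vector nu
        divisor = abs(sum(w * c for w, c in zip(omega, nu)))
        best = min(best, divisor * norm ** tau)
    return best


# Rotation vector built on the golden mean (strongly nonresonant)
golden = (1.0, (math.sqrt(5.0) - 1.0) / 2.0)
# Resonant rotation vector: omega . nu = 0 for nu = (1, -2)
resonant = (1.0, 0.5)
```

For the golden-mean vector the returned value stays bounded away from zero as `max_norm` grows, whereas for the resonant vector it is exactly zero, so no positive $C_0$ can work.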


By saying that there is an invariant torus with 
rotation vector $\omega$, one means that there is an
invariant surface in phase space on which, in 
suitable coordinates, the dynamics is the same as in 
the unperturbed case, and the conjugation (i.e., the 
change of variables which leads to such coordinates) 
is analytic in the angle variables and in the 
perturbation parameter. One also says that the 
torus of an integrable system ($\varepsilon = 0$) is preserved
(or even persists) under a small perturbation. 

Note that, a posteriori, this proves convergence of 
the perturbation series: however, a direct check of 
convergence was performed only recently by 
Eliasson (1996). Kolmogorov's proof was based on
a completely different idea, namely, performing
iteratively a sequence of canonical transformations
(which are changes of coordinates preserving the
Hamiltonian structure of the equations of motion)
such that at each step the size of the perturbation is
reduced. Of course, on the basis of Poincaré's result,
this iterative procedure cannot work for all initial
conditions (e.g., when $\omega$ does not satisfy [4]). The
key point in Kolmogorov's scheme is to fix the
rotation vector $\omega$ of the torus one is looking for, in
such a way that the small divisors are controlled
through the Diophantine condition [4] and the
exponentially fast convergence of the algorithm.

New proofs and extensions of Kolmogorov's 
theorem were given later by Arnol'd (1962) and by 
Moser (1962); hence, the acronym KAM to denote 
such a theorem. Arnol'd gave a more detailed (and 
slightly different) proof compared to the original 
one by Kolmogorov, and applied the result to the 
planar three-body problem, thus showing that 
physical applications of the theorem were possible. 
Moser, on the other hand, proposed a modified 
method using a technique introduced by Nash 
(which approximates smooth functions with analy- 
tical ones) to deal with the case of systems with 
finite smoothness. 

For fixed small enough $\varepsilon$, the surviving invariant
tori cover a large portion of the phase space, called
the Kolmogorov set; the relative measure of the
region of phase space which is not filled by such tori
tends to zero at least as $\sqrt{\varepsilon}$ for $\varepsilon \to 0$. A system
described by a Hamiltonian like [3] is then called a 
quasi-integrable Hamiltonian system. 

The excluded region of phase space corresponds 
to the unperturbed tori which are destroyed by the 
perturbation: the rotation vectors of such tori are 
close to a resonance, that is, to a value $\omega$ such that
$\omega \cdot \nu = 0$ for some integer vector $\nu$, and these are
exactly the vectors which do not satisfy the
Diophantine condition [4] for any value $C_0$. A
subset of phase space of this kind is called a 
resonance region. 

At first sight, this would seem to provide an 
explanation for the results found by Fermi, Pasta, 
and Ulam, but this is not quite the case. First, the 
threshold value $\varepsilon_0$ depends on $N$, and goes to zero
very fast as $N \to \infty$ (in general as $N!^{-\alpha}$ for some
$\alpha > 0$); however, the results of the numerical
experiments apparently were insensitive to the 
number N of oscillators. Second, the KAM theorem 
deals with maximal tori, that is, tori characterized 
by rotation vectors which have as many components 
as the number of degrees of freedom, while the 
rotation vectors of the numerical quasiperiodic 
solutions seem to involve just a small number of 
components. 

Finally, as an extra problem, the
nondegeneracy condition for the unperturbed
Hamiltonian is violated, because the unperturbed
Hamiltonian is linear in the action variables (one
says that the Hamiltonian is isochronous). Recently,
Rink (2001), by continuing the work by Nishida, 
showed that in the Fermi-Pasta-Ulam problem it is 
possible to perform a canonical change of coordi- 
nates such that in the new variables the Hamiltonian 
becomes anisochronous: one uses part of the 
perturbation to remove isochrony. But the other 
two obstacles remain. 


Lower-Dimensional Tori 


A natural question is what happens to the invariant 
tori corresponding to rotation vectors which are not 
rationally independent, that is, vectors satisfying $n$
resonance conditions, such as $\omega \cdot \nu_i = 0$ for $n$
independent vectors $\nu_1, \ldots, \nu_n$, with $1 \le n \le N - 2$
(the case $n = N - 1$ corresponds to periodic orbits
and is comparatively easy); for instance, one can
take $\omega = (\omega_1, \ldots, \omega_{N-n}, 0, \ldots, 0)$ and, by a suitable
linear change of coordinates, one can always make
the reduction to a case of this kind. In particular,
one can ask if a result analogous to the KAM
theorem holds for these tori. Such a problem for the
model [3] has not been studied very widely in the
literature. What has usually been considered is a
system of $n$ rotators coupled with a system with
$s = N - n$ degrees of freedom near an equilibrium
point: then one calls normal coordinates the
coordinates describing the latter, and the role of
the parameter $\varepsilon$ is played by the size of the normal
coordinates (if their initial conditions are chosen
near the equilibrium point). In the absence of
perturbation (i.e., for $\varepsilon = 0$), one has either hyperbolic
or elliptic or, more generally, mixed tori,
according to the nature of the equilibrium points: 
one refers to these tori as lower-dimensional tori, as 
they represent n-dimensional invariant surfaces in a 
system with N degrees of freedom. Then one can 
study the preservation of such tori. 

One can prove that, in such a case, at least if 
certain generic conditions are satisfied, in suitable 
coordinates, n angles rotate with frequencies 
$\omega_1, \ldots, \omega_n$, respectively, while the remaining $N - n$
angles have to be fixed close to some values 
corresponding to the extremal points of the function 
obtained by averaging the potential over the rotating 
angles. 

The case of hyperbolic tori is easier, as in the case
of elliptic tori one has to exclude some values of $\varepsilon$ to
avoid some further resonance conditions between
the rotation vector $\omega$ and the normal frequencies $\lambda_k$
(i.e., the eigenvalues of the linearized system
corresponding to the normal coordinates), known
as the first and second Mel'nikov conditions:

Such conditions appear, with the values of the
normal frequencies slightly modified by terms
depending on $\varepsilon$, at each iterative step, and in the
end one can have elliptic lower-dimensional tori
only for values of $\varepsilon$ belonging to some Cantor set.

The second Mel'nikov conditions are not really 
necessary, and in fact they can be relaxed as Bourgain 
(1994) has shown; this is an important fact, as it 
allows degenerate normal frequencies, which were 
forbidden in the previous works by Kuksin (1987), 
Eliasson (1988), and Pöschel (1989).

Similar results also apply in the case of lower- 
dimensional tori for the model [3], which represents
a sort of degenerate situation, as the normal
frequencies vanish for $\varepsilon = 0$. Again, one has to use
part of the perturbation to remove the complete 
degeneracy of normal frequencies. 


Quasiperiodic Solutions in Partial 
Differential Equations 


For explaining the Fermi-Pasta-Ulam experiment,
one has to deal with systems with arbitrarily many
degrees of freedom. Hence, it is natural to investigate
systems which have ab initio infinitely many
degrees of freedom, such as the nonlinear wave
equation, $u_{tt} - u_{xx} + V(x) u = \varphi(u)$, the nonlinear
Schrödinger equation, $i u_t - u_{xx} + V(x) u = \varphi(u)$, the
nonlinear Korteweg-de Vries equation $u_t + u_{xxx} - 6 u_x u = \varphi(u)$,
and other systems of nonlinear partial
differential equations (PDEs); the continuum limit of
the Fermi-Pasta-Ulam model gives indeed a nonlinear
Korteweg-de Vries equation, as shown by
Zabusky and Kruskal (1965). Here $(t, x) \in \mathbb{R} \times [0, \pi]^d$,
if $d$ is the space dimension, and either periodic
($u(0, t) = u(\pi, t)$) or Dirichlet ($u(0, t) = u(\pi, t) = 0$)
boundary conditions can be considered; $\varphi(u)$ is a
function analytic in $u$ and starting from orders strictly
higher than one, while $V(x)$ is an analytic function of
$x$, depending on extra parameters $\xi_1, \ldots, \xi_n$. Such a
function is introduced essentially for technical reasons,
as we shall see that the eigenvalues $\lambda_k$ of the
Sturm-Liouville operator $-\partial_x^2 + V(x)$ must satisfy
some Diophantine conditions. If we set $V(x) = \mu \in \mathbb{R}$
in the nonlinear wave equation, we obtain the Klein-Gordon
equation, which, in the particular case $\mu = 0$,
reduces to the string equation. Again, the role of the
perturbation parameter is played by the size of the
solution itself.

Small-amplitude periodic and quasiperiodic
solutions for PDE systems have been extensively
studied, among others, by Kuksin, Wayne, Craig,
Pöschel, and Bourgain. Results for such systems read
as follows. Consider for concreteness the one-dimensional
nonlinear wave equation with Dirichlet boundary
conditions and with $\varphi(u) = u^3 + O(u^5)$. When the
nonlinear function $\varphi(u)$ is absent, any solution of the
linear wave equation $u_{tt} - u_{xx} + V(x) u = 0$ is a
superposition of either finitely or infinitely many periodic
solutions with frequencies $\lambda_k$ determined by the
function $V(x)$. Let $u_0(\omega t, x)$ be a quasiperiodic
solution of the linear wave equation with rotation
vector $\omega \in \mathbb{R}^n$, where $\omega_k = \lambda_{m_k}$, for some $n$-tuple
$(m_1, \ldots, m_n)$. Then for $\varepsilon$ small enough there exists a
subset $\Xi_\varepsilon$ of the space of parameters with large
Lebesgue measure (more precisely, with complementary
Lebesgue measure which tends to zero when
$\varepsilon \to 0$) such that for all $\xi = (\xi_1, \ldots, \xi_n) \in \Xi_\varepsilon$ there is a
solution $u_\varepsilon(t, x)$ of the nonlinear wave equation and a
rotation vector $\omega_\varepsilon$ satisfying the conditions

$$|u_\varepsilon(t, x) - \sqrt{\varepsilon}\, u_0(\omega_\varepsilon t, x)| \le C \varepsilon, \qquad |\omega_\varepsilon - \omega| \le C \varepsilon \qquad [6]$$

for some positive constant $C$.

The case n= 1 (periodic solutions) is not as easy 
as the finite-dimensional case, because there are 
infinitely many normal frequencies, so that there are 
small divisor problems which for finite-dimensional 
systems appear only for $n \ge 2$.

For the nonlinear wave equation and the
Schrödinger equation, if $n = 1$, one can take
$V(x) = \mu$, but one needs $\mu \neq 0$; for $n > 1$, one can
take $V(x) = \mu$, as one can perform a preliminary
transformation leading to an equation in which a
function depending on parameters naturally
appears, as shown by Kuksin and Pöschel (1996).
For $n = 1$, the case $\mu = 0$ has been very recently
solved by Gentile et al. (2005).

Statements for more general situations can also
be obtained, while extensions to space dimensions
$d \ge 2$ are not trivial and have been obtained only
recently by Bourgain (1998). The above result also
holds if the number of components of the rotation
vector is less than the number of parameters: one
uses such parameters because one needs to impose
some Diophantine conditions such as [5], now for
all the frequencies $\lambda_k$ with $k \notin \{m_1, \ldots, m_n\}$. Again,
the second Mel'nikov conditions were shown by
Bourgain to be unnecessary, and this is an essential
ingredient for the higher-dimensional case.

Even if systems of the type considered above have
been widely studied, they remain significantly
different from a discrete system such as the chain
of oscillators [1] for $N$ large enough (also in the
limit $N \to \infty$), so that the results which have been
found for PDE systems do not really provide an 
explanation for the numerical findings. 

Also in the case of lower-dimensional tori for finite- 
dimensional systems the main problem is that, even if 
such tori exist, it is not clear what relevance they can 
have for the dynamics (a case in which hyperbolic tori 
play a role is considered later). An important feature of 
maximal tori is that they fill most of the phase space, a 
property which certainly does not hold for lower- 
dimensional tori, which lie outside the Kolmogorov set. 

In the Fermi-Pasta-Ulam experiment, one con- 
siders initial conditions close to lower-dimensional 
tori; hence, an interesting problem is to study their 
stability, that is, how fast the trajectories starting 
from such initial conditions drift away. 


Arnol’d Diffusion and Nekhoroshev’s 
Theorem 


Consider again the maximal tori. For N=2, the 
preservation of most of the invariant tori prevents the 
possibility of diffusion in phase space: the tori 
represent two-dimensional surfaces in a three-dimen- 
sional space (as dynamics occur on the level surfaces 
of the energy in a four-dimensional space), so that, if 
an initial condition is trapped in a gap between two 
tori, the corresponding trajectory remains confined 
forever between them. The situation is quite different 
for $N \ge 3$: in such a case, the tori do not represent a
topological obstruction to diffusion any more. 

That mechanisms of diffusion are really possible 
was shown by Arnol’d (1963). Because of the 
perturbation, lower-dimensional hyperbolic tori 
appear inside the resonance regions, with their 
stable and unstable manifolds (whiskers). It is 
possible that these manifolds of the same torus 
intersect with a nonvanishing angle (homoclinic 
angle); as a consequence, the angles between the 
stable and unstable manifolds of nearby tori 
(heteroclinic angles) can also be different from 
zero, and one can find a set of hyperbolic lower- 
dimensional tori such that the unstable manifold of 
each of them intersects the stable manifold of the 
torus next to it: one says that such tori form a 
transition chain of heteroclinic connections. Then 
there can be trajectories moving along such connections,
producing at the end a drift of order 1 (in $\varepsilon$) in
the action variables. Such a phenomenon is referred 
to as Arnol'd diffusion.

Of course, diffusing trajectories should be located
in the region of phase space where there are no
invariant tori (hence, a very small region when $\varepsilon$ is
small), but an important consequence is that, unlike
what happens in the unperturbed case, not all 
motions are stable: in particular, the action variables 
can change by a large amount over long times. 

Providing interesting examples of Hamiltonian
systems in which Arnol'd diffusion can occur is not
so easy: in fact, for the diffusion to really occur, one
needs a lower bound on the homoclinic angles, and
evaluating these angles can be difficult. For
instance, Arnol'd's (1963) original example, which
describes a system near a resonance region, is a
two-parameter system given by

$$H = \frac{1}{2} (A_1^2 + A_2^2) + A_3 + \mu (\cos \alpha_1 - 1) + \varepsilon \mu (\cos \alpha_1 - 1)(\sin \alpha_2 + \cos \alpha_3) \qquad [7]$$

and the angles can be proved to be bounded from
below only by assuming that the perturbation parameter
$\varepsilon$ is exponentially small with respect to the other
parameter $\mu$, which in turn implies a situation not
really convincing from a physical point of view. More
generally, for all the examples which are discussed in
the literature, the relation with physics (such as the d'Alembert
problem on the possibility for a planet to change the
inclination of the precession cone) is not obvious.

So the question naturally arises as to how fast
such a mechanism of diffusion can be, and how relevant
it is for practical purposes. A first answer is
provided by a theorem of Nekhoroshev (1977), 
which states the following result. 


Theorem 2 Suppose we have an N-degree-of-freedom
quasi-integrable Hamiltonian system,
where the unperturbed Hamiltonian satisfies some
condition such as convexity (or a weaker one,
known as steepness, which is rather involved to
state in a concise way); for concreteness consider a
function $H_0(A)$ in [3] which is quadratic in $A$. Then
there are two positive constants $a$ and $b$ such that
for times $t$ up to $O(\exp(\varepsilon^{-a}))$ the variations of the
action variables cannot be larger than $O(\varepsilon^{b})$.


The constants $a$ and $b$ depend on $N$, and they tend
to zero when $N \to \infty$; Lochak and Neishtadt (1992)
and Pöschel (1993) found estimates $a = b = 1/(2N)$,
which are probably in general optimal. Nekhoroshev's
theorem is usually stated in the form above,
but it provides more information than that explicitly
written: the trajectories, when trapped into a
resonance region, drift away and come close to
some invariant torus, and then they behave like
quasiperiodic motions, up to very small corrections,
for a long time, until they enter some other
resonance region, and so on. Of course, for initial
conditions on some invariant torus, the KAM theorem
applies, but the new result concerns initial conditions
which do not belong to any torus.
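
To get a feeling for how these estimates degrade with the number of degrees of freedom, one can tabulate the two scales in Theorem 2 using the exponents a = b = 1/(2N) quoted above. The following is only a rough sketch: all the multiplicative constants hidden in the O(...) symbols are set to 1, which is an assumption made here for illustration, and the function names are hypothetical. To avoid numerical overflow, the logarithm of the stability time is returned rather than the time itself.

```python
def nekhoroshev_exponents(n_dof):
    """Exponents a = b = 1/(2N) quoted in the text for the convex case."""
    a = 1.0 / (2 * n_dof)
    return a, a


def stability_scales(eps, n_dof):
    """Return (log of stability time, bound on action drift) = (eps^-a, eps^b).

    All constants in the O(...) estimates are set to 1 (an assumption)."""
    a, b = nekhoroshev_exponents(n_dof)
    return eps ** (-a), eps ** b


if __name__ == "__main__":
    for n_dof in (2, 5, 10, 50):
        log_time, drift = stability_scales(1e-8, n_dof)
        print(n_dof, log_time, drift)
```

For fixed perturbation size, the logarithm of the stability time shrinks and the allowed drift grows as N increases, which is the quantitative form of the difficulty with many degrees of freedom.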

Nekhoroshev's theorem gives a lower bound for 
the diffusion time, that is, the time required for a 
drift of order 1 to occur in the action variables. But, 
of course, an upper bound would also be desirable. 
The diffusion times are related to the amplitude of 
the homoclinic angles, which are very small (and 
difficult to estimate as stated before). The strongest 
results in this direction have been obtained with 
variational methods, for instance, by Bessi, Bernard, 
Berti, and Bolle: at best, for the diffusion time, one
finds an estimate $O(\varphi^{-1} \log \varphi^{-1})$, if $\varphi$ is the amplitude
of the homoclinic angles (which in turn are
exponentially small in some power of $\varepsilon$, as one can
expect as a consequence of Nekhoroshev's theorem).

Then one can imagine that the results of the Fermi- 
Pasta-Ulam experiment can also be interpreted in the 
light of Nekhoroshev's theorem. The solutions one 
finds numerically certainly do not correspond to 
maximal tori, but one could expect that they could be 
solutions which appear to be quasiperiodic for long 
but finite times (e.g., moving near some lower- 
dimensional torus determined by the initial condi- 
tions), and that if one really insists on observing the 
time evolution for a very long time, then deviations 
from quasiperiodic behavior could be detected. This 
is an appealing interpretation, and the most recent 
numerical results make it plausible: Galgani and
Giorgilli (2003) have found numerically that the
energy, even if initially confined to the lower modes,
tends to be shared among all the other modes, and
the higher the modes, the longer the time needed for the
energy to flow to them. Of course, this does not settle
the problem, as there is still the issue of the large 
number of degrees of freedom; furthermore, for large 
N the spacing between the frequencies is small, and 
they become almost degenerate. Hence, the problem 
still has to be considered as open. 


Stability versus Chaos 


The main problem in applying the KAM theorem
seems to be related to the small value of the threshold
$\varepsilon_0$ which is required. In general, as the size of the
perturbation parameter increases, the region of
phase space filled with invariant tori shrinks (or even
disappears), and chaotic motions appear. By the latter,
one generally means motions which are highly 
sensitive to the initial conditions: a small variation of 
the initial conditions produces a catastrophic variation 
in the corresponding trajectories (this is due to the 
appearance of strictly positive Lyapunov exponents). 
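
The contrast between nonpositive and strictly positive Lyapunov exponents can be illustrated on a toy model. The sketch below uses the Chirikov standard map, which is not discussed in this article and is chosen here only as a convenient stand-in for a perturbed integrable system; the function name and the parameter values are illustrative assumptions. The largest Lyapunov exponent is estimated by evolving a tangent vector along the orbit and averaging its logarithmic growth.

```python
import math


def largest_lyapunov(k, theta, p, steps):
    """Estimate the largest Lyapunov exponent of the standard map
    p' = p + k sin(theta), theta' = theta + p', by evolving a tangent vector."""
    dtheta, dp = 0.0, 1.0  # unit tangent vector
    log_growth = 0.0
    for _ in range(steps):
        # linearized (tangent) map, evaluated at the current point
        dp = dp + k * math.cos(theta) * dtheta
        dtheta = dtheta + dp
        # the map itself
        p = p + k * math.sin(theta)
        theta = (theta + p) % (2.0 * math.pi)
        # renormalize to avoid overflow, accumulating the logarithmic growth
        norm = math.hypot(dtheta, dp)
        log_growth += math.log(norm)
        dtheta, dp = dtheta / norm, dp / norm
    return log_growth / steps
```

With `k = 0` the map is integrable and the estimate decays like log(steps)/steps, reflecting the linear separation of nearby orbits; for large `k` (for instance `k = 10`) a generic orbit gives a clearly positive value. The finite-time estimate depends on the initial condition and on the number of steps, so it should be read as indicative only.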


A natural question is then how such a result as the 
KAM theorem is meaningful in physical situations: 
in other words, for which systems the KAM theorem 
can really apply. 

One of the main motivations to study such a 
problem was to explain astronomical observations 
and to study the stability of the solar system. In 
order to apply the KAM theorem to the solar 
system, one has to interpret the gravitational forces 
between the planets as perturbations of a collection 
of several decoupled two-body systems (each planet 
with the Sun). One can write the masses of the
planets as $\varepsilon m_i$, and $\varepsilon$ plays the role of the
perturbation parameter. The corresponding Hamiltonian
(after suitable reductions and scalings) is

where $i = 0$ corresponds to the Sun, while
$i = 1, \ldots, N$ correspond to the planets (hence
$N = 9$), $m_0$ is the mass of the Sun, and $\varepsilon \mu_i$ are the
reduced masses ($\mu_i^{-1} = m_i^{-1} + \varepsilon m_0^{-1}$); here
$(q_i, p_i) \in \mathbb{R}^3 \times \mathbb{R}^3$, $i = 0, \ldots, N$, the inner product in $p_i \cdot p_j$ is
in $\mathbb{R}^3$, and the norm $|\cdot|$ is the Euclidean one.

A first difficulty is that the solar system is a properly 
degenerate system; that is, the unperturbed Hamilto- 
nian does not depend on all the action variables. But 
such a degeneracy can be removed by performing a 
canonical change of coordinates which produces a new 
Hamiltonian in which the integrable part contains new 
terms of order $\varepsilon$ depending on all action variables and
is nondegenerate, while the perturbation becomes of
order $\varepsilon^2$: the angle variables corresponding to the
actions not originally appearing in the unperturbed 
Hamiltonian are called the slow variables, while the 
others are called the fast variables. 

However, a naive implementation of the KAM 
theorem, in general, even for simplified but still 
realistic systems, would provide a preposterously 
small value of the threshold $\varepsilon_0$. The problem could
be just a computational one: in principle, a very 
refined estimate of the threshold could give a better 
value, so that it is very difficult to decide analytically 
if the real values of the planetary masses allow the 
solar system to fall inside the regime of appli- 
cability of the KAM theorem. Results in this 
direction have been obtained, but only for special 
situations: for instance, by considering the restri- 
cted planar circular three-body problem (which 
provides a simplified description of the system
“Sun + Jupiter + asteroid”), Celletti and Chierchia
(1997) found analytical bounds on the perturbation
parameters comparable with the physical values. Of 
course, this is not at all conclusive for the general 
situation in which all planets (with their satellites 
and the asteroids) are considered together; in 
particular, it does not shed light on the problem of 
the stability of the entire solar system. 

On the contrary, extensive numerical simulations 
performed by Laskar (starting from 1989) seem to 
suggest that the solar system is unstable. Deflections 
from the current orbits could be produced to such an 
extent that collisions between planets could not be 
avoided: Mercury could collide with Venus and be 
ejected from the solar system. An important issue is 
to consider the times over which such phenomena 
can occur. Laskar’s numerical simulations show that 
such times are less than the estimated age of the solar 
system, and that one can make accurate predictions 
for the planetary motions only for a finite amount of 
time (~100 Myr). Furthermore, the assumed partial 
instability of the solar system has also been used by 
Laskar (2004) to explain some observed phenomena 
such as the evolution of the obliquity (which is the 
angle between equator and orbital plane) of some 
planets. Of course, these simulations have been 
carried out with several approximations, such as that of
averaging over the fast variables, which allows one to 
use a large integration step in the numerical integra- 
tion of the equations of motion for the resulting 
system. This is the so-called secular system intro- 
duced by Lagrange: instead of the fast motion of the 
planets, one describes the slow deformations of the 
planetary orbits (imagining the planets as regions of 
mass spread along their orbits). 


See also: Averaging Methods; Bifurcation Theory; 
Billiards in Bounded Convex Domains; Diagrammatic 
Techniques in Perturbation Theory; Dynamical Systems 
and Thermodynamics; Gravitational N-Body Problem 
(Classical); Hamiltonian Systems: Stability and Instability 
Theory; Hamilton—Jacobi Equations and Dynamical 
Systems: Variational Aspects; Integrable Systems and 
Discrete Geometry; KAM Theory and Celestial 
Mechanics; Localization for Quasiperiodic Potentials; 
Stability Problems in Celestial Mechanics; 
Synchronization of Chaos; Weakly Coupled Oscillators.


Introduction


The standard model (SM) is a consistent, finite, 
and — within the limitations of our present technical 
ability — computable theory of fundamental micro- 
scopic interactions that successfully explains most 
of the known phenomena in elementary particle 
physics. The SM describes strong, electromagnetic, 
and weak interactions. All microscopic phenomena 
observed to date can be attributed to one or the 
other of these interactions. For example, the forces 
that hold together the protons and the neutrons in
the atomic nuclei are due to strong interactions; the
binding of electrons to nuclei in atoms or of atoms 
in molecules is caused by electromagnetism; and the 
energy production in the Sun and the other stars 
occurs through nuclear reactions induced by weak 
interactions. In principle, gravitational forces 
should also be included in the list of fundamental 
interactions but their impact on fundamental 
particle processes at accessible energies is totally 
negligible. 

The structure of the SM is a generalization of
that of quantum electrodynamics (QED), in the
sense that it is a renormalizable field theory based
on a local symmetry (i.e., separately valid at each
spacetime point $x$) that extends the gauge invariance
of electrodynamics to a larger set of
conserved currents and charges. There are eight
strong charges, called “color” charges, and four
electroweak charges (which, in particular, include
the electric charge). The commutators of these
charges form the $SU(3) \otimes SU(2) \otimes U(1)$ algebra. In
QED, the interaction between two matter particles
with electric charges (e.g., two electrons) is
mediated by the exchange of one (or more) photons
emitted by one electron and reabsorbed by the
second. In the SM the matter fields, all of spin 1/2,
are the quarks, the constituents of protons, neutrons,
and all hadrons, endowed with both color
and electroweak charges, and the leptons (the
electron $e^-$, the muon $\mu^-$, the tauon $\tau^-$, plus the
three associated neutrinos $\nu_e$, $\nu_\mu$, and $\nu_\tau$) with no
color but with electroweak charges. The matter
fermions come in three generations or families with
identical quantum numbers but different masses.
The pattern is as follows:

Each family contains a weakly charged doublet of
quarks, in three color replicas, and a colorless
weakly charged doublet with a neutrino and a
charged lepton. At present, there is no explanation
for this triple repetition of fermion families. The
force carriers, of spin 1, are the photon $\gamma$, the weak
interaction gauge bosons $W^+$, $W^-$, and $Z^0$, and the
eight gluons $g$ that mediate the strong interactions.
The photon and the gluons have zero masses as a
consequence of the exact conservation of the
corresponding symmetry generators, the electric
charge and the eight color charges. The weak
bosons $W^+$, $W^-$, and $Z^0$ have large masses
($m_W \sim 80.4$ GeV, $m_Z \sim 91.2$ GeV), signaling that the
corresponding symmetries are badly broken. In the SM,
the spontaneous breaking of the electroweak gauge
symmetry is induced by the Higgs mechanism,
which predicts the presence of one (or more) spin 0
particles in the physical spectrum, the Higgs
boson(s), not yet experimentally observed. A tremendous
experimental effort is underway or
planned to reveal the Higgs sector as the last crucial
missing link in the SM verification.


Quantum Chromodynamics


The statement that quantum chromodynamics
(QCD) is a renormalizable gauge theory based on
the group SU(3) with color triplet quark matter
fields fixes the QCD Lagrangian density to be

$$\mathcal{L} = -\frac{1}{4} \sum_{A=1}^{8} F^{A\mu\nu} F^{A}_{\mu\nu} + \sum_{j=1}^{n_f} \bar{q}_j (i \slashed{D} - m_j) q_j \qquad [2]$$

Here $q_j$ are the quark fields (of $n_f$ different flavors)
with mass $m_j$; $\slashed{D} = D_\mu \gamma^\mu$, where $\gamma^\mu$ are the Dirac
matrices and $D_\mu$ is the covariant derivative

$$D_\mu = \partial_\mu - i e_s \sum_A t^A g^A_\mu \qquad [3]$$

$e_s$ is the gauge coupling (in analogy with QED,

$$\alpha_s = \frac{e_s^2}{4\pi} \qquad [4]$$

here and throughout this article natural units,
$\hbar = c = 1$, are used); $g^A_\mu$, $A = 1, \ldots, 8$, are the gluon
fields, and $t^A$ are the SU(3) group generators in the
triplet representation of quarks (i.e., $t^A$ are $3 \times 3$
matrices acting on $q$); the generators obey the
commutation relations $[t^A, t^B] = i C_{ABC} t^C$, where
$C_{ABC}$ are the completely antisymmetric structure
constants of SU(3) (the normalization of $C_{ABC}$ and
of $e_s$ is specified by $\mathrm{tr}[t^A t^B] = \frac{1}{2} \delta^{AB}$);

$$F^A_{\mu\nu} = \partial_\mu g^A_\nu - \partial_\nu g^A_\mu - e_s C_{ABC} g^B_\mu g^C_\nu \qquad [5]$$

The physical vertices in QCD include the
gluon-quark-antiquark vertex, analogous to the QED
photon-fermion-antifermion coupling, but also the
three-gluon and four-gluon vertices, of order $e_s$ and
$e_s^2$, respectively, which have no analog in an abelian
theory like QED. In QED, the photon (a neutral
particle) is coupled to all electrically charged
particles. In QCD, the gluons are colored,
hence self-coupled. This is reflected in the fact that
in QED $F_{\mu\nu}$ is linear in the gauge field, so that the
term $F_{\mu\nu}^2$ in the Lagrangian is a pure kinetic term,
while in QCD $F^A_{\mu\nu}$ is quadratic in the gauge field, so
that in $F^{A2}_{\mu\nu}$ we find cubic and quartic vertices
beyond the kinetic term.

The QCD Lagrangian in eqn [2] has a simple 
structure but a very rich dynamical content, includ- 
ing the observed complex spectroscopy with a large 
number of hadrons. The most prominent properties 
of QCD are asymptotic freedom and confinement. 
In field theory, the effective coupling of a given 
interaction vertex is modified by the interaction. As 
a result, the measured intensity of the force depends 
on the transferred (four-)momentum squared, $Q^2$, 
among the participants. In QCD, the relevant 
coupling parameter that appears in physical processes 
is $\alpha_s$ (see eqn [4]). Asymptotic freedom means 
that the effective coupling becomes a function of 
$Q^2$: $\alpha_s(Q^2)$ decreases for increasing $Q^2$ and vanishes 
asymptotically. Thus, the QCD interaction becomes 
very weak in processes with large $Q^2$, called hard 
processes or deep inelastic processes (i.e., with a 
final-state distribution of momenta and a particle 
content very different from that in the initial state). 
One can prove that in four spacetime dimensions all 
gauge theories based on a noncommuting group of 
symmetry are asymptotically free, and conversely. 
The effective coupling decreases very slowly at large 
momenta with the inverse logarithm of $Q^2$: 
$\alpha_s(Q^2) \sim 1/(b\log Q^2/\Lambda^2)$, where $b$ is a known 
constant and $\Lambda$ is an energy of the order of a few 
hundred MeV. Since in quantum mechanics large 
momenta imply short wavelengths, the result is that 
at short distances the potential between two color 
charges is similar to the Coulomb potential, that is, 
proportional to $\alpha_s(r)/r$, with an effective color 
charge which is small at short distances. On the 
contrary, the interaction strength becomes large at 
large distances or small transferred momenta, of 
order $Q \lesssim \Lambda$. In fact, the observed hadrons are tightly 
bound composite states of quarks, with compensating 
color charges so that they are overall neutral in color. 
The property of confinement is the impossibility of 
separating color charges, like individual quarks and 
gluons. This is because in QCD the interaction 
potential between color charges increases, at long 
distances, linearly in $r$. When we try to separate the 
quark and the antiquark that form a color-neutral 
meson, the interaction energy grows until pairs of 
quarks and antiquarks are created from the vacuum 
and new neutral mesons are formed instead of free 
quarks. For example, consider the process $e^+e^- \to q\bar{q}$ 
at large center-of-mass energies. The final-state quark 
and antiquark have large energies, so they separate in 
opposite directions very fast. But the color-confinement 
forces create new pairs in between them. Two 
back-to-back jets of colorless hadrons are observed 
with a number of slow pions that make the exact 
separation of the two jets impossible. In some 
cases, a third well-separated jet of hadrons is also 
observed: these events correspond to the radiation 
of an energetic gluon from the parent quark-antiquark 
pair. 
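
As a rough numerical illustration of asymptotic freedom, the leading-logarithm formula quoted above can be evaluated directly. The sketch below is not part of the original discussion: it assumes the standard one-loop SU(3) value $b = (33 - 2n_f)/12\pi$ for $n_f$ active flavors and an illustrative $\Lambda = 0.2$ GeV, and the function name is hypothetical.

```python
import math

def alpha_s_one_loop(q_gev, lambda_gev=0.2, n_f=5):
    """Leading-log coupling alpha_s(Q^2) = 1 / (b * log(Q^2 / Lambda^2)).

    Assumptions (not taken from the text): b = (33 - 2*n_f) / (12*pi),
    the one-loop SU(3) coefficient, and Lambda = 0.2 GeV as an
    illustrative scale of "a few hundred MeV".
    """
    if q_gev <= lambda_gev:
        raise ValueError("the formula is only meaningful for Q > Lambda")
    b = (33 - 2 * n_f) / (12 * math.pi)
    return 1.0 / (b * math.log(q_gev**2 / lambda_gev**2))

if __name__ == "__main__":
    # The coupling falls only logarithmically as Q grows.
    for q in (2.0, 10.0, 91.2, 1000.0):
        print(f"Q = {q:7.1f} GeV   alpha_s ~ {alpha_s_one_loop(q):.3f}")
```

The slow logarithmic decrease is visible in the printed values; the growth of the coupling as $Q$ approaches $\Lambda$ signals the breakdown of the perturbative formula rather than a prediction of its own.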


Electroweak Interactions 


We split the electroweak Lagrangian into two parts 
by separating the Higgs boson couplings: 


$$\mathcal{L} = \mathcal{L}_{\rm symm} + \mathcal{L}_{\rm Higgs} \qquad [6]$$

We start by specifying $\mathcal{L}_{\rm symm}$, which involves only 
gauge bosons and fermions (a sum over all flavors of 
quarks and leptons, generally indicated by $\psi$, is 
understood): 

$$\mathcal{L}_{\rm symm} = -\frac{1}{4}\sum_{A=1}^{3} F^A_{\mu\nu}F^{A\mu\nu} - \frac{1}{4}B_{\mu\nu}B^{\mu\nu} + \bar\psi_L i\gamma^\mu D_\mu \psi_L + \bar\psi_R i\gamma^\mu D_\mu \psi_R \qquad [7]$$

This is the Yang-Mills Lagrangian for the gauge 
group SU(2) $\otimes$ U(1) with fermion matter fields. Here 

$$B_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu \quad \text{and} \quad F^A_{\mu\nu} = \partial_\mu W^A_\nu - \partial_\nu W^A_\mu - g\,\epsilon_{ABC}\, W^B_\mu W^C_\nu \qquad [8]$$

are the gauge antisymmetric tensors constructed out 
of the gauge field $B_\mu$ associated with U(1), and $W^A_\mu$ 
corresponding to the three SU(2) generators; $\epsilon_{ABC}$ 
are the group structure constants (see eqn [11]), 
which, for SU(2), coincide with the totally antisymmetric 
Levi-Civita tensor (recall the familiar 
angular-momentum commutators). 

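The fermion fields are described through their 
left- and right-hand components: 

$$\psi_{L,R} = [(1 \mp \gamma_5)/2]\,\psi, \qquad \bar\psi_{L,R} = \bar\psi\,[(1 \pm \gamma_5)/2] \qquad [9]$$

Note that, as given in eqn [9], 

$$\bar\psi_L = \psi_L^\dagger\gamma_0 = \psi^\dagger[(1 - \gamma_5)/2]\gamma_0 = \bar\psi[\gamma_0(1 - \gamma_5)/2]\gamma_0 = \bar\psi[(1 + \gamma_5)/2]$$

The matrices $P_\pm = (1 \pm \gamma_5)/2$ are projectors. They 
satisfy the relations $P_\pm P_\pm = P_\pm$, $P_\pm P_\mp = 0$, 
$P_+ + P_- = 1$. 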
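
The projector relations can be checked by explicit matrix multiplication. The following sketch assumes the Dirac representation of the $\gamma$ matrices and the convention $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$; both are choices made here for illustration (the text does not fix a representation), and all helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(ca, a, cb, b):
    """Entrywise linear combination ca*a + cb*b."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [l + r for l, r in zip(tl, tr)] + [l + r for l, r in zip(bl, br)]

def neg(m):
    return [[-x for x in row] for row in m]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Dirac representation (an assumed basis choice).
GAMMA0 = block(I2, Z2, Z2, neg(I2))
GAMMAS = [block(Z2, s, neg(s), Z2) for s in PAULI]

# gamma_5 = i * gamma^0 gamma^1 gamma^2 gamma^3
product = GAMMA0
for g in GAMMAS:
    product = matmul(product, g)
GAMMA5 = [[1j * x for x in row] for row in product]

IDENTITY4 = block(I2, Z2, Z2, I2)
ZERO4 = block(Z2, Z2, Z2, Z2)
P_PLUS = lincomb(0.5, IDENTITY4, 0.5, GAMMA5)
P_MINUS = lincomb(0.5, IDENTITY4, -0.5, GAMMA5)

if __name__ == "__main__":
    print("P+ P+ = P+ :", matmul(P_PLUS, P_PLUS) == P_PLUS)
    print("P+ P- = 0  :", matmul(P_PLUS, P_MINUS) == ZERO4)
    print("P+ + P- = 1:", lincomb(1, P_PLUS, 1, P_MINUS) == IDENTITY4)
```

Because every entry is a small exact binary fraction, the comparisons are exact; in another representation the matrices differ but the three relations hold unchanged.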

The standard electroweak theory is a chiral 
theory, in the sense that $\psi_L$ and $\psi_R$ behave 
differently under the gauge group. In particular, all 
$\psi_R$ are singlets and all $\psi_L$ are doublets in the 
minimal SM (MSM). Thus, mass terms for fermions 
(of the form $\bar\psi_L\psi_R$ + h.c.) are forbidden in the 
symmetric limit. Fermion masses are introduced, 
together with $W^\pm$ and $Z$ masses, by the mechanism 
of symmetry breaking. The covariant derivatives 
$D_\mu\psi_{L,R}$ are explicitly given by 

$$D_\mu\psi_{L,R} = \left[\partial_\mu + ig\sum_{A=1}^{3} t^A_{L,R}W^A_\mu + ig'\,\frac{1}{2}Y_{L,R}B_\mu\right]\psi_{L,R} \qquad [10]$$

where $t^A_{L,R}$ and $\frac{1}{2}Y_{L,R}$ are the SU(2) and U(1) 
generators, respectively, in the reducible representations 
$\psi_{L,R}$. The commutation relations of the SU(2) 
generators are given by 

$$[t^A_L, t^B_L] = i\epsilon_{ABC}\,t^C_L \quad \text{and} \quad [t^A_R, t^B_R] = i\epsilon_{ABC}\,t^C_R \qquad [11]$$


We use the normalization $\mathrm{tr}[t^At^B] = \frac{1}{2}\delta^{AB}$ in the 
fundamental representation of SU(2). The electric 
charge generator $Q$ (in units of $e$, the positron 
charge) is given by 

$$Q = t^3_L + \tfrac{1}{2}Y_L = t^3_R + \tfrac{1}{2}Y_R \qquad [12]$$
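
Eqn [12] can be illustrated with the fermions of one generation. The hypercharge values in the table below are the usual MSM assignments in the convention $Q = t^3 + Y/2$; they are not listed in the text and are supplied here as an assumption, together with the hypothetical names used in the code.

```python
from fractions import Fraction as F

# (weak isospin t3, hypercharge Y, expected electric charge Q in units of e).
# Hypercharges are the standard MSM values for Q = t3 + Y/2 (assumed here).
FERMIONS = {
    "nu_L": (F(1, 2), F(-1), F(0)),
    "e_L": (F(-1, 2), F(-1), F(-1)),
    "e_R": (F(0), F(-2), F(-1)),
    "u_L": (F(1, 2), F(1, 3), F(2, 3)),
    "d_L": (F(-1, 2), F(1, 3), F(-1, 3)),
    "u_R": (F(0), F(4, 3), F(2, 3)),
    "d_R": (F(0), F(-2, 3), F(-1, 3)),
}

def electric_charge(t3, hypercharge):
    """Eqn [12]: Q = t3 + Y/2."""
    return t3 + hypercharge / 2

if __name__ == "__main__":
    for name, (t3, y, q_expected) in FERMIONS.items():
        q = electric_charge(t3, y)
        print(f"{name:5s} t3 = {t3!s:>4}  Y = {y!s:>4}  Q = {q!s:>4}")
```

The left- and right-handed components of each fermion carry different $t^3$ and $Y$ but the same $Q$, which is what allows the photon to couple equally to both chiralities.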


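All fermion couplings to the gauge bosons can be 
derived directly from eqns [7] and [10]. The charged-current 
(CC) couplings are the simplest. From 

$$g(t^1W^1_\mu + t^2W^2_\mu) = g\left\{[(t^1 + it^2)/\sqrt{2}]\,[(W^1_\mu - iW^2_\mu)/\sqrt{2}] + \text{h.c.}\right\} = g\left\{[t^+W^-_\mu/\sqrt{2}] + \text{h.c.}\right\} \qquad [13]$$

where $t^\pm = t^1 \pm it^2$ and $W^\pm = (W^1 \pm iW^2)/\sqrt{2}$, we 
obtain the vertex 

$$V_{\bar\psi\psi W} = g\,\bar\psi\gamma_\mu\left[(t^+_L/\sqrt{2})(1 - \gamma_5)/2 + (t^+_R/\sqrt{2})(1 + \gamma_5)/2\right]\psi\, W^{-\mu} + \text{h.c.} \qquad [14]$$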


In the neutral-current (NC) sector, the photon $A_\mu$ 
and the mediator $Z_\mu$ of the weak NC are orthogonal 
and normalized linear combinations of $B_\mu$ and $W^3_\mu$: 

$$A_\mu = \cos\theta_W B_\mu + \sin\theta_W W^3_\mu, \qquad Z_\mu = -\sin\theta_W B_\mu + \cos\theta_W W^3_\mu \qquad [15]$$

Equations [15] define the weak mixing angle $\theta_W$. 
The photon is characterized by equal couplings to 
left and right fermions with a strength equal to the 
electric charge. Recalling eqn [12] for the charge 
matrix $Q$, we immediately obtain 

$$g\sin\theta_W = g'\cos\theta_W = e \qquad [16]$$

or, equivalently, 

$$\tan\theta_W = g'/g \qquad [17]$$

Once $\theta_W$ has been fixed by the photon couplings, it 
is a simple matter of algebra to derive the $Z$ 
couplings, with the result 

$$\Gamma_{\bar\psi\psi Z} = \frac{g}{2\cos\theta_W}\,\bar\psi\gamma_\mu\left[t^3_L(1 - \gamma_5) + t^3_R(1 + \gamma_5) - 2Q\sin^2\theta_W\right]\psi\, Z^\mu \qquad [18]$$

where $\Gamma_{\bar\psi\psi Z}$ is a notation for the vertex. In the 
MSM, $t^3_R = 0$ and $t^3_L = \pm 1/2$. Note that the CC and 
NC weak couplings do not conserve P (parity) and C 
(charge conjugation). 
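
Eqns [16]-[18] lend themselves to a short numerical sketch. The code below derives $g$ and $g'$ from $e$ and $\sin^2\theta_W$, and splits the bracket of eqn [18] into its $\gamma_5$-independent part and the coefficient of $-\gamma_5$. The value $\sin^2\theta_W = 0.231$ is only an illustrative input chosen here, and the function names are hypothetical.

```python
import math

def gauge_couplings(alpha, sin2_theta_w):
    """Return (e, g, g') from eqn [16]: g sin(theta_W) = g' cos(theta_W) = e."""
    e = math.sqrt(4 * math.pi * alpha)
    g = e / math.sqrt(sin2_theta_w)
    g_prime = e / math.sqrt(1 - sin2_theta_w)
    return e, g, g_prime

def z_bracket(t3_left, charge, sin2_theta_w, t3_right=0.0):
    """Decompose the bracket of eqn [18] as (vector part) - (axial part) * gamma_5.

    vector part = t3_L + t3_R - 2 Q sin^2(theta_W)
    axial part  = t3_L - t3_R
    """
    vector = t3_left + t3_right - 2 * charge * sin2_theta_w
    axial = t3_left - t3_right
    return vector, axial

if __name__ == "__main__":
    sin2 = 0.231  # illustrative input, not a value quoted in the text
    e, g, g_prime = gauge_couplings(1 / 137.036, sin2)
    print(f"e = {e:.4f}, g = {g:.4f}, g' = {g_prime:.4f}")
    print(f"tan(theta_W) = {g_prime / g:.4f}")
    for name, t3, q in (("neutrino", 0.5, 0.0), ("electron", -0.5, -1.0),
                        ("u quark", 0.5, 2 / 3), ("d quark", -0.5, -1 / 3)):
        v, a = z_bracket(t3, q, sin2)
        print(f"{name:9s} vector = {v:+.4f}  axial = {a:+.4f}")
```

For the charged leptons the vector part nearly cancels when $\sin^2\theta_W$ is close to 1/4, which is visible in the printed electron entry.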

In order to derive the effective four-fermion 
interactions that are equivalent, at low energies, to 
the CC and NC couplings given in eqns [14] and 
[18], we anticipate that large masses, as experimentally 
observed, are provided for $W^\pm$ and $Z$ by $\mathcal{L}_{\rm Higgs}$. 
For left-left CC couplings, when the momentum 
transfer squared can be neglected with respect to 
$m_W^2$ in the propagator of Born diagrams with single 
$W$ exchange, from eqn [14], we can write 

$$\mathcal{L}^{\rm CC}_{\rm eff} \simeq (g^2/8m_W^2)\,[\bar\psi\gamma_\mu(1 - \gamma_5)t^+_L\psi]\,[\bar\psi\gamma^\mu(1 - \gamma_5)t^-_L\psi] \qquad [19]$$

By specializing further in the case of doublet fields 
such as $\nu_e - e^-$ or $\nu_\mu - \mu^-$, we obtain the tree-level 
relation of $g$ with the Fermi coupling constant 
$G_F$ measured from $\mu$ decay ($G_F = 1.16639(2) \times 10^{-5}$ GeV$^{-2}$): 

$$G_F/\sqrt{2} = g^2/8m_W^2 \qquad [20]$$

By recalling that $g\sin\theta_W = e$, we can also cast this 
relation in the form 

$$m_W = \mu_{\rm Born}/\sin\theta_W \qquad [21]$$

with 

$$\mu_{\rm Born} = (\pi\alpha/\sqrt{2}\,G_F)^{1/2} \simeq 37.2802\ \text{GeV} \qquad [22]$$

where $\alpha$ is the fine-structure constant of QED 
($\alpha \equiv e^2/4\pi = 1/137.036$). 
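
The numerical value in eqn [22] follows directly from the quoted inputs, as the short check below shows. The value of $\sin^2\theta_W$ used to evaluate eqn [21] is an illustrative assumption, not a number given in the text, and the function names are hypothetical.

```python
import math

ALPHA = 1 / 137.036   # fine-structure constant, as quoted in the text
G_FERMI = 1.16639e-5  # Fermi constant in GeV^-2, as quoted in the text

def mu_born(alpha=ALPHA, g_fermi=G_FERMI):
    """Eqn [22]: mu_Born = (pi * alpha / (sqrt(2) * G_F))**(1/2), in GeV."""
    return math.sqrt(math.pi * alpha / (math.sqrt(2) * g_fermi))

def m_w_tree(sin2_theta_w):
    """Eqn [21]: tree-level W mass in GeV for a given sin^2(theta_W)."""
    return mu_born() / math.sqrt(sin2_theta_w)

if __name__ == "__main__":
    print(f"mu_Born = {mu_born():.4f} GeV")
    # sin^2(theta_W) = 0.231 is an illustrative input only.
    print(f"tree-level m_W for sin^2(theta_W) = 0.231: {m_w_tree(0.231):.2f} GeV")
```

The tree-level $W$ mass obtained this way is shifted by the radiative corrections mentioned later in the text, so it should not be read as a prediction for the measured value.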

In the same way, for neutral currents we obtain, 
in Born approximation, from eqn [18], the effective 
four-fermion interaction given by 

$$\mathcal{L}^{\rm NC}_{\rm eff} \simeq \sqrt{2}\,G_F\,\rho_0\,\bar\psi\gamma_\mu[\ldots]\psi\;\bar\psi\gamma^\mu[\ldots]\psi \qquad [23]$$

where 

$$[\ldots] \equiv t^3_L(1 - \gamma_5) + t^3_R(1 + \gamma_5) - 2Q\sin^2\theta_W \qquad [24]$$

and 

$$\rho_0 = m_W^2/m_Z^2\cos^2\theta_W \qquad [25]$$

All couplings given in this section are obtained at 
tree level and are modified in higher orders of 
perturbation theory. In particular, the relations 
between $m_W$ and $\sin\theta_W$ (eqns [21] and [22]) and 
the observed values of $\rho$ ($\rho = \rho_0$ at tree level) in 
different NC processes are altered by computable 
small electroweak radiative corrections. 

The gauge-boson self-interactions can be derived 
from the $F_{\mu\nu}$ term in $\mathcal{L}_{\rm symm}$, by using eqn [15] and 
$W^\pm = (W^1 \pm iW^2)/\sqrt{2}$. For the three-gauge-boson 
vertex $W^+W^-V$ with $V = Z, \gamma$, we obtain 

$$\Gamma_{W^-W^+V} = i g_{W^-W^+V}\left[g_{\mu\nu}(q - p)_\lambda + g_{\mu\lambda}(p - r)_\nu + g_{\nu\lambda}(r - q)_\mu\right] \qquad [26]$$

with 

$$g_{W^-W^+\gamma} = g\sin\theta_W = e \quad \text{and} \quad g_{W^-W^+Z} = g\cos\theta_W \qquad [27]$$

This form of the triple gauge vertex is very special: in 
general, there could be departures from the above SM 
expression, even restricting us to SU(2) $\otimes$ U(1) gauge 
symmetric and C and P invariant couplings. In fact, 
some small corrections are already induced by the 
radiative corrections. The SM form of the triple gauge 
vertex has been experimentally confirmed by measuring 
the cross section $e^+e^- \to W^+W^-$ at LEP. 

We now turn to the Higgs sector of the electroweak 
Lagrangian. The Higgs Lagrangian is specified 
by the gauge principle and the requirement of 
renormalizability to be 

$$\mathcal{L}_{\rm Higgs} = (D_\mu\phi)^\dagger(D^\mu\phi) - V(\phi^\dagger\phi) - \bar\psi_L\Gamma\psi_R\phi - \bar\psi_R\Gamma^\dagger\psi_L\phi^\dagger \qquad [28]$$

where $\phi$ is a column vector including all Higgs 
scalar fields; it transforms as a reducible representation 
of the gauge group. The quantities $\Gamma$ (which 
include all coupling constants) are matrices that 
make the Yukawa couplings invariant under the 
Lorentz and gauge groups. The potential $V(\phi^\dagger\phi)$, 
symmetric under SU(2) $\otimes$ U(1), contains, at most, 
quartic terms in $\phi$ so that the theory is 
renormalizable: 

$$V(\phi^\dagger\phi) = -\frac{1}{2}\mu^2\phi^\dagger\phi + \frac{1}{4}\lambda(\phi^\dagger\phi)^2 \qquad [29]$$

Spontaneous symmetry breaking is induced if the 
minimum of $V$, which is the classical analog of 
the quantum-mechanical vacuum state (both are the 
states of minimum energy), is obtained for nonvanishing 
$\phi$ values. This occurs because we have taken 
$\mu^2$ and $\lambda$ positive in $V$ (note the "wrong" sign of the 
mass term). Precisely, we denote the vacuum 
expectation value (VEV) of $\phi$, that is, the position 
of the minimum, by $v$: 

$$\langle 0|\phi(x)|0\rangle = v \neq 0 \qquad [30]$$
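
The location of the minimum can be made explicit for the quartic potential of eqn [29]. Writing $x = \phi^\dagger\phi \geq 0$, the potential $V(x) = -\frac{1}{2}\mu^2 x + \frac{1}{4}\lambda x^2$ is stationary at $x = \mu^2/\lambda$, where it takes the value $-\mu^4/4\lambda$. The sketch below checks this numerically; the parameter values are arbitrary illustrations and the function names are hypothetical.

```python
def potential(x, mu2, lam):
    """Eqn [29] as a function of x = phi^dagger phi (with mu2 = mu^2)."""
    return -0.5 * mu2 * x + 0.25 * lam * x * x

def minimum_location(mu2, lam):
    """Stationary point of V in x: dV/dx = -mu^2/2 + lam*x/2 = 0."""
    if mu2 <= 0 or lam <= 0:
        raise ValueError("symmetry breaking requires mu^2 > 0 and lambda > 0")
    return mu2 / lam

if __name__ == "__main__":
    mu2, lam = 2.0, 0.5  # arbitrary illustrative values
    x0 = minimum_location(mu2, lam)
    print("x at minimum:", x0)
    print("V at minimum:", potential(x0, mu2, lam))
    print("V at x = 0  :", potential(0.0, mu2, lam))
```

Since $V(0) = 0$ lies above the value at $x = \mu^2/\lambda$, the symmetric point $\phi = 0$ is not the state of minimum energy, which is the content of the "wrong"-sign remark.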


The fermion mass matrix is obtained from the 
Yukawa couplings by replacing $\phi(x)$ by $v$: 

$$M = \bar\psi_L\,\mathcal{M}\,\psi_R + \bar\psi_R\,\mathcal{M}^\dagger\,\psi_L \qquad [31]$$

with 

$$\mathcal{M} = \Gamma \cdot v \qquad [32]$$

In the SM, where all left fermions, $\psi_L$, are doublets 
and all right fermions, $\psi_R$, are singlets, only Higgs 
doublets can contribute to fermion masses. There 
are enough free couplings in $\Gamma$, so that one single 
complex Higgs doublet is indeed sufficient to 
generate the most general fermion mass matrix. It 
is important to observe that by a suitable change of 
basis we can always make the matrix $\mathcal{M}$ Hermitian, 
$\gamma_5$-free and diagonal. In fact, we can make separate 
unitary transformations on $\psi_L$ and $\psi_R$ according to 

$$\psi'_L = U\psi_L, \qquad \psi'_R = V\psi_R \qquad [33]$$

and consequently 

$$\mathcal{M} \to \mathcal{M}' = U^\dagger\mathcal{M}V \qquad [34]$$

This transformation does not alter the general 
structure of the fermion couplings in $\mathcal{L}_{\rm symm}$. 

If only one Higgs doublet is present, the change of 
basis that makes $\mathcal{M}$ diagonal will at the same time 
also diagonalize the fermion-Higgs Yukawa couplings. 
Thus, in this case, no flavor-changing neutral 
Higgs exchanges are present. This is not true, in 
general, when there are several Higgs doublets. But 
one Higgs doublet for each electric charge sector, 
that is, one doublet coupled only to u-type quarks, 
one doublet to d-type quarks, and one doublet to charged 
leptons, would also be satisfactory, because the mass 
matrices of fermions with different charges are 
diagonalized separately. In fact, at the moment, the 
simplest model with only one Higgs doublet seems 
adequate for describing all observed phenomena. 

Weak charged currents are the only tree-level 
interactions in the SM that change flavor: by 
emission of a $W$, a u-type quark is turned into a 
d-type quark, or a $\nu_l$ neutrino is turned into an 
$l^-$ charged lepton (all fermions are left-handed). If 
we start from a u-type quark that is a mass 
eigenstate, emission of a $W$ turns it into a d-type 
quark state $d'$ (the weak isospin partner of $u$) that in 
general is not a mass eigenstate. In general, the mass 
eigenstates and the weak eigenstates do not coincide 
and a unitary transformation connects the two sets: 

$$\begin{pmatrix} d' \\ s' \\ b' \end{pmatrix} = V \begin{pmatrix} d \\ s \\ b \end{pmatrix} \qquad [35]$$

or, in shorthand, $D' = VD$, where $V$ is the Cabibbo-Kobayashi-Maskawa 
(CKM) matrix. Thus, in terms 
of mass eigenstates the charged weak current of 
quarks is of the form 

$$J_\mu \propto \bar{U}\gamma_\mu(1 - \gamma_5)\,t^+\,VD \qquad [36]$$

Since $V$ is unitary (i.e., $VV^\dagger = V^\dagger V = 1$) and commutes 
with $T^2$, $T_3$, and $Q$ (because all d-type quarks 
have the same isospin and charge), the neutral current 
couplings are diagonal both in the primed and 
unprimed basis (if the $Z$ d-type quark current is 
abbreviated as $\bar{D}'\Gamma D'$ then by changing basis we get 
$\bar{D}V^\dagger\Gamma VD$, and $V$ and $\Gamma$ commute because, as seen 
from eqn [24], $\Gamma$ is made of Dirac matrices and $T_3$ and 
$Q$ generator matrices). It follows that $\bar{D}'\Gamma D' = \bar{D}\Gamma D$. 
This is the Glashow-Iliopoulos-Maiani (GIM) 
mechanism that ensures natural flavor conservation 
of the neutral current couplings at the tree level. For 
three generations of quarks, the CKM matrix depends 
on four physical parameters: three mixing angles and 
one phase. This phase is the unique source of CP 
violation in the SM. 
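
The count of three angles and one phase follows from a standard parameter-counting argument, sketched here for a general number of generations. An $n \times n$ unitary matrix has $n^2$ real parameters; $2n - 1$ relative phases can be absorbed by redefining the quark fields; of the remaining $(n - 1)^2$ parameters, $n(n - 1)/2$ are rotation angles and the rest are phases. The function name below is hypothetical.

```python
def ckm_parameters(n_generations):
    """Return (mixing angles, physical phases) of the quark mixing matrix."""
    n = n_generations
    total = n * n                 # real parameters of an n x n unitary matrix
    removable = 2 * n - 1         # quark-field rephasings (one overall phase is inert)
    physical = total - removable  # equals (n - 1)**2
    angles = n * (n - 1) // 2     # parameters of an n x n real rotation
    phases = physical - angles    # equals (n - 1) * (n - 2) / 2
    return angles, phases

if __name__ == "__main__":
    for n in (2, 3, 4):
        angles, phases = ckm_parameters(n)
        print(f"{n} generations: {angles} angle(s), {phases} phase(s)")
```

With two generations no physical phase survives, so at least three generations are needed for the CKM matrix to accommodate CP violation.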

We now consider the gauge-boson masses and their 
couplings to the Higgs. These effects are induced by 
the $(D_\mu\phi)^\dagger(D^\mu\phi)$ term in $\mathcal{L}_{\rm Higgs}$ (eqn [28]), where 

$$D_\mu\phi = \left[\partial_\mu + ig\sum_{A=1}^{3} t^AW^A_\mu + ig'(Y/2)B_\mu\right]\phi \qquad [37]$$

Here $t^A$ and $\frac{1}{2}Y$ are the SU(2) $\otimes$ U(1) generators in 
the reducible representation spanned by $\phi$. Not only 
doublets but all non-singlet Higgs representations can 
contribute to gauge-boson masses. The condition that 
the photon remains massless is equivalent to the 
condition that the vacuum is electrically neutral: 

$$Q|v\rangle = (t^3 + \tfrac{1}{2}Y)|v\rangle = 0 \qquad [38]$$


The charged $W$ mass is given by the quadratic terms 
in the $W$ field arising from $\mathcal{L}_{\rm Higgs}$, when $\phi(x)$ is 
replaced by $v$. We obtain 

$$m_W^2\,W^+_\mu W^{-\mu} = g^2\left|(t^+v/\sqrt{2})\right|^2 W^+_\mu W^{-\mu} \qquad [39]$$

whilst for the $Z$ mass we get (recalling eqn [15]) 

$$\tfrac{1}{2}m_Z^2\,Z_\mu Z^\mu = \left|[g\cos\theta_W\,t^3 - g'\sin\theta_W\,(Y/2)]\,v\right|^2 Z_\mu Z^\mu \qquad [40]$$

where the factor of 1/2 on the left-hand side is the 
correct normalization for the definition of the mass 
of a neutral field. For Higgs doublets 

$$\phi = \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix}, \qquad v = \begin{pmatrix} 0 \\ v \end{pmatrix} \qquad [41]$$

we obtain 

$$m_W^2 = \tfrac{1}{2}g^2v^2, \qquad m_Z^2 = \tfrac{1}{2}g^2v^2/\cos^2\theta_W \qquad [42]$$

Note that by using eqn [20] we obtain 

$$v = 2^{-3/4}\,G_F^{-1/2} = 174.1\ \text{GeV} \qquad [43]$$

It is also evident that for Higgs doublets 

$$\rho_0 = m_W^2/m_Z^2\cos^2\theta_W = 1 \qquad [44]$$

This relation is typical of one or more Higgs doublets 
and would be spoiled by the existence of, for example, 
Higgs triplets. This result is valid at the tree level and is 
modified by calculable small electroweak radiative 
corrections. The $\rho_0$ parameter has been measured from 
the intensity of NC interactions (recall eqn [25]) and 
confirmed to be close to unity at a few per mille level. 
In MSM only one Higgs doublet is present. Then the 
fermion-Higgs couplings are in proportion to the 
fermion masses. In fact, from the Yukawa couplings 
$g_{\phi\bar{f}f}(\bar{f}_L\phi f_R + \text{h.c.})$, the mass $m_f$ is obtained by replacing 
$\phi$ by $v$, so that $m_f = g_{\phi\bar{f}f}\,v$. In MSM, three out of the 
four Hermitian fields are removed from the physical 
spectrum by the Higgs mechanism and become the 
longitudinal modes of $W^+$, $W^-$, and $Z$, which acquire a 
mass. The fourth neutral Higgs is physical and should 
be found. If more doublets are present, two more 
charged and two more neutral Higgs scalars should be 
around for each additional doublet. 
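
The tree-level relations of eqns [42]-[44], together with the proportionality $m_f = g_{\phi\bar{f}f}\,v$, can be evaluated numerically. In the sketch below only $G_F$ is taken from the text; the values of $g$ and $\sin^2\theta_W$ passed in the example are illustrative assumptions, and the function names are hypothetical.

```python
import math

G_FERMI = 1.16639e-5  # GeV^-2, as quoted in the text

def higgs_vev(g_fermi=G_FERMI):
    """Eqn [43]: v = 2**(-3/4) * G_F**(-1/2), in GeV."""
    return 2 ** (-0.75) / math.sqrt(g_fermi)

def tree_level_masses(g, sin2_theta_w, v):
    """Eqn [42]: m_W^2 = g^2 v^2 / 2 and m_Z^2 = m_W^2 / cos^2(theta_W)."""
    m_w = g * v / math.sqrt(2)
    m_z = m_w / math.sqrt(1 - sin2_theta_w)
    return m_w, m_z

def rho0(m_w, m_z, sin2_theta_w):
    """Eqn [44]: rho_0 = m_W^2 / (m_Z^2 cos^2(theta_W))."""
    return m_w**2 / (m_z**2 * (1 - sin2_theta_w))

def yukawa_coupling(m_fermion, v):
    """Coupling implied by m_f = g_{phi f f} * v."""
    return m_fermion / v

if __name__ == "__main__":
    v = higgs_vev()
    print(f"v = {v:.1f} GeV")
    # g = 0.63 and sin^2(theta_W) = 0.231 are illustrative inputs only.
    m_w, m_z = tree_level_masses(0.63, 0.231, v)
    print(f"m_W = {m_w:.2f} GeV, m_Z = {m_z:.2f} GeV, rho_0 = {rho0(m_w, m_z, 0.231):.6f}")
```

By construction $\rho_0 = 1$ for any inputs here, since the code implements the doublet relations of eqn [42]; a Higgs triplet contribution would require a different mass formula.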

The couplings of the physical Higgs $H$ to the 
gauge bosons can be simply obtained from $\mathcal{L}_{\rm Higgs}$, by 
the replacement 

$$\phi(x) = \begin{pmatrix} \phi^+(x) \\ \phi^0(x) \end{pmatrix} \to \begin{pmatrix} 0 \\ v + (H/\sqrt{2}) \end{pmatrix}$$

(so that $(D_\mu\phi)^\dagger(D^\mu\phi) = \frac{1}{2}(\partial_\mu H)^2 + \cdots$), with the 
result 

$$\mathcal{L}[H, W, Z] = g^2(v/\sqrt{2})\,W^+_\mu W^{-\mu}H + (g^2/4)\,W^+_\mu W^{-\mu}H^2 + \left[(g^2 v\,Z_\mu Z^\mu)/(2\sqrt{2}\cos^2\theta_W)\right]H + \left[g^2/(8\cos^2\theta_W)\right]Z_\mu Z^\mu H^2$$

In MSM, the Higgs mass $m_H^2 \sim \lambda v^2$ is of order of 
the weak scale $v$ but cannot be predicted because the 
value of $\lambda$ is not fixed. The dominant decay mode of 
the Higgs is in the $b\bar{b}$ channel below the $WW$ 
threshold, while the $W^+W^-$ channel is dominant for 
sufficiently large $m_H$. The width is small below the 
$WW$ threshold, not exceeding a few MeV, but 
increases steeply beyond the threshold, reaching the 
asymptotic value of $\Gamma \sim \frac{1}{2}m_H^3$ at large $m_H$, where 
all energies and masses are in TeV. 

A central role in the experimental verification of 
the standard electroweak theory has been played by 
CERN, the European Laboratory for Particle Physics, 
located near Geneva, between France and Switzerland. 
The indirect effects of the $Z_0$, that is, the 
occurrence of weak processes induced by the neutral 
current, were first observed in 1974 at CERN by the 
Gargamelle Collaboration (named after the bubble 
chamber used in the experiment). Later, in 1982, the 
$W^\pm$ and the $Z_0$ were, for the first time, directly 
produced and observed in proton-antiproton collisions 
by the UA1 and UA2 collaborations and then 
further studied with the same technique both at 
CERN and subsequently at the Tevatron of Fermilab 
near Chicago. From 1989 until 2000, LEP, the large $e^+e^-$ 
collider, operated at CERN. In the LEP 
circular ring of circumference $\sim$27 km, electrons and 
positrons were accelerated in opposite directions to an 
equal energy in the range between 45 and 103 GeV. 
The beams were made to cross and collide at 
four experimental areas where the 
ALEPH, DELPHI, L3, and OPAL detectors were 
located to study the final states produced in the 
collisions. In its first phase, called LEP1, from 1989 
to 1995, the LEP operation was completely 
dedicated to a precise study of the $Z_0$ properties 
(mass, lifetime, and decay modes) in order to accurately 
test the predictions of the SM. The main lessons of the 
precision tests of the standard electroweak theory can 
be summarized as follows. It has been checked that the 
couplings of quarks and leptons to the weak gauge 
bosons $W^\pm$ and $Z$ are indeed precisely those prescribed 
by the gauge symmetry. The accuracy of a few tenths 
of 1% for these tests implies that not only the tree 
level, but also the structure of quantum corrections has 
been verified. Then, from the end of 1995, the energy 
of LEP was increased and the phase of LEP2 was 
started. The total energy was gradually increased up to 
206 GeV. The main physics goals of LEP2 were the 
search for the Higgs and for possible new particles, the 
precise measurement of $m_W$, and the experimental 
study of the triple gauge vertices $WW\gamma$ and $WWZ_0$. 
The Higgs particle of the SM could in principle be 
produced at LEP2 in the reaction $e^+e^- \to Z_0H$, 
which proceeds by $Z_0$ exchange. The nonobservation 
of the Higgs particle at LEP2 has allowed a 
lower limit on its mass to be established: $m_H > 114$ GeV. Indirect 
indications on the Higgs mass were also obtained 
from the precision tests of the SM, as the radiative 
effects depend logarithmically on $m_H$. The indication 
is that the Higgs mass cannot be too heavy if the SM is 
valid: $m_H \lesssim 219$ GeV at 95% c.l. In 2001, LEP was 
dismantled and, in its tunnel, a new double ring of 
superconducting magnets is being installed. The new 
accelerator, the LHC (Large Hadron Collider), will be 
a proton-proton collider of total center-of-mass 
energy 14 TeV. Two large experiments, ATLAS and 
CMS, will continue to search for the Higgs starting in 
the year 2007. The sensitivity of LHC experiments to 
the SM Higgs will go up to masses $m_H$ of $\sim$1 TeV. 


See also: Effective Field Theories; Electric-Magnetic 
Duality; Electroweak Theory; General Relativity: 


Introduction 


This article treats a specific class of stationary 
solutions to the Einstein field equations which read 

$$R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R = \frac{8\pi G}{c^4}\,T_{\mu\nu} \qquad [1]$$

Here $R_{\mu\nu}$ and $R = g^{\mu\nu}R_{\mu\nu}$ are, respectively, the Ricci 
tensor and the Ricci scalar of the spacetime metric 
$g_{\mu\nu}$, $G$ the Newton constant, and $c$ the speed of light. 
The tensor $T_{\mu\nu}$ is the stress-energy tensor of matter. 
Spacetimes, or regions thereof, where $T_{\mu\nu} = 0$ are 
called vacuum. 

Stationary solutions are of interest for a variety 
of reasons. As models for compact objects at rest, 
or in steady rotation, they play a key role in 
astrophysics. They are easier to study than nonsta- 
tionary systems because stationary solutions are 
governed by elliptic rather than hyperbolic equa- 
tions. Finally, like in any field theory, one expects 
that large classes of dynamical solutions approach 
("settle down to") a stationary state in the final 
stages of their evolution. 

The simplest stationary solutions describing compact 
isolated objects are the spherically symmetric 
ones. In the vacuum region, these are all given by the 
Schwarzschild family. A theorem of Birkhoff shows 
that in the vacuum region any spherically symmetric 
metric, even without assuming stationarity, belongs to 
the family of Schwarzschild metrics, parametrized by a 
positive mass parameter $m$. Thus, regardless of 
possible motions of the matter, as long as they remain 
spherically symmetric, the exterior metric is the 
Schwarzschild one for some constant $m$. This has the 
following consequence for stellar dynamics: imagine 
following the collapse of a cloud of pressureless fluid 
("dust"). Within Newtonian gravity, this dust cloud 
will, after finite time, contract to a point at which the 
density and the gravitational potential diverge. However, 
this result cannot be trusted as a sensible physical 
prediction because, even if one supposes that Newtonian 
gravity is still valid at very high densities, a 
matter model based on noninteracting point particles 
is certainly not. Consider, next, the same situation in 
the Einstein theory of gravity: here a new question 
arises, related to the form of the Schwarzschild metric 
outside of the spherically symmetric body: 

$$g = -V^2\,dt^2 + V^{-2}\,dr^2 + r^2\,d\Omega^2, \qquad V^2 = 1 - \frac{2Gm}{rc^2}, \qquad t \in \mathbb{R},\ r \in \left(\frac{2Gm}{c^2}, \infty\right) \qquad [2]$$

Here $d\Omega^2$ is the line element of the standard 
2-sphere. Since the metric [2] seems to be singular as 
$r = 2m$ is approached (from now on, we use units in 
which $G = c = 1$), there arises the need to understand 
what happens at the surface of the star when the 
radius $r = 2m$ is reached. One thus faces the need of 
a careful study of the geometry of the metric [2] 
when $r = 2m$ is approached, and crossed. 

The first key feature of the metric [2] is its 
stationarity, of course, with Killing vector field $X$ 
given by $X = \partial_t$. A Killing field, by definition, is a 
vector field the local flow of which generates isometries. 
A spacetime (the term spacetime denotes a 
smooth, paracompact, connected, orientable, and 
time-orientable Lorentzian manifold) is called stationary 
if there exists a Killing vector field $X$ which 
approaches $\partial_t$ in the asymptotically flat region (where $r$ 
goes to $\infty$; see below for precise definitions) and 
generates a one-parameter group of isometries. A 
spacetime is called static if it is stationary and if the 
stationary Killing vector $X$ is hypersurface orthogonal, 
that is, $X^\flat \wedge dX^\flat = 0$, where $X^\flat = X_\mu\,dx^\mu = g_{\mu\nu}X^\nu\,dx^\mu$. 
A spacetime is called axisymmetric if there exists a 
Killing vector field $Y$, which generates a one-parameter 
group of isometries and which behaves like a rotation 
in the asymptotically flat region, with all orbits 
$2\pi$-periodic. In asymptotically flat spacetimes, this 
implies that there exists an axis of symmetry, that is, a 
set on which the Killing vector vanishes. Killing vector 
fields which are a nontrivial linear combination of a 
time translation and of a rotation in the asymptotically 
flat region are called stationary rotating, or helical. 

There exists a technique, due independently to 
Kruskal and Szekeres, of attaching together two 
regions r>2m and two regions r<2m of the 
Schwarzschild metric, as in Figure 1, to obtain a 
manifold with a metric which is smooth at r= 2m. 
In the extended spacetime, the hypersurface $\{r = 2m\}$ 
is a null hypersurface $\mathcal{E}$, the Schwarzschild event 
horizon. The stationary Killing vector $X = \partial_t$ 
extends to a Killing vector in the extended spacetime 
which becomes tangent to and null on $\mathcal{E}$. The global 
properties of the Kruskal-Szekeres extension of the 
exterior Schwarzschild spacetime make this spacetime 
a natural model for a nonrotating black hole. It is 
worth noting here that the exterior Schwarzschild 
spacetime [2] admits an infinite number of noniso- 
metric vacuum extensions, even in the class of 
maximal, analytic, simply connected ones. The 
Kruskal-Szekeres extension is singled out by the 
properties that it is maximal, vacuum, analytic, simply 
connected, with all maximally extended geodesics 
either complete, or with the area r of the orbits of the 
isometry groups tending to zero along them. 

We can now come back to the problem of the 
contracting dust cloud according to the Einstein 
theory. For simplicity, we take the density of the 
dust to be uniform - the so-called Oppenheimer- 
Snyder solution. It then turns out that, in the course 
of collapse, the surface of the dust will eventually 
cross the Schwarzschild radius, leaving behind a 
Schwarzschild black hole. If one follows the dust 
cloud further, a singularity will eventually form, but 
will not be visible from the "outside region" where 
$r > 2m$. For a collapsing body of the mass of the 
Sun, say, one has $2m = 3$ km. Thus, standard 
phenomenological matter models such as that for 
dust can still be trusted, so that the previous 
objection to the Newtonian scenario does not apply. 
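
The figure of 3 km for the Sun can be recovered by restoring the constants in $r = 2Gm/c^2$. The sketch below uses rounded SI values for $G$, $c$, and the solar mass; these numbers are standard reference values supplied here, not quantities given in the text, and the function name is hypothetical.

```python
G_NEWTON = 6.674e-11  # m^3 kg^-1 s^-2 (rounded reference value)
C_LIGHT = 2.998e8     # m/s (rounded reference value)
M_SUN = 1.989e30      # kg (rounded reference value)

def schwarzschild_radius_m(mass_kg):
    """Radius r = 2 G m / c^2 at which V^2 in eqn [2] vanishes, in metres."""
    return 2 * G_NEWTON * mass_kg / C_LIGHT**2

if __name__ == "__main__":
    r_sun_km = schwarzschild_radius_m(M_SUN) / 1000.0
    print(f"Schwarzschild radius of the Sun: {r_sun_km:.2f} km")
```

The Sun's actual radius is several hundred thousand kilometres, so the mean density of a solar mass compressed to this radius is very large, though still within the range where phenomenological matter models are used.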

There is a rotating generalization of the Schwarz- 
schild metric, namely the two-parameter family of 
exterior Kerr metrics, which in Boyer-Lindquist 
coordinates takes the form 


$$g = -\frac{\Delta - a^2\sin^2\theta}{\Sigma}\,dt^2 - \frac{2a\sin^2\theta\,(r^2 + a^2 - \Delta)}{\Sigma}\,dt\,d\varphi + \frac{(r^2 + a^2)^2 - \Delta a^2\sin^2\theta}{\Sigma}\,\sin^2\theta\,d\varphi^2 + \frac{\Sigma}{\Delta}\,dr^2 + \Sigma\,d\theta^2 \qquad [3]$$

Figure 1 The Kruskal-Szekeres extension of the Schwarzschild solution. (Adapted with permission from Nicolas J-P (2002) Dirac fields 
on asymptotically flat space-times. Dissertationes Mathematicae 408: 1-85.) 


with $0 \le a < m$. Here $\Sigma = r^2 + a^2\cos^2\theta$, $\Delta = r^2 + a^2 - 2mr$, 
and $r_+ < r < \infty$, where $r_+ = m + (m^2 - a^2)^{1/2}$. 
When $a = 0$, the Kerr metric reduces to the 
Schwarzschild metric. The Kerr metric is again a 
vacuum solution, and it is stationary with $X = \partial_t$ the 
asymptotic time translation, as well as axisymmetric 
with $Y = \partial_\varphi$ the generator of rotations. Similarly to 
the Schwarzschild case, it turns out that the metric 
can be smoothly extended across $r = r_+$, with $\{r = r_+\}$ 
being a smooth null hypersurface $\mathcal{E}$ in the extension. 
The null generator $K$ of $\mathcal{E}$ is the limit of the 
stationary-rotating Killing field $X + \omega Y$, where 
$\omega = a/(2mr_+)$. On the other hand, the Killing vector 
$X$ is timelike only outside the hypersurface $\{r = m + 
(m^2 - a^2\cos^2\theta)^{1/2}\}$, on which $X$ becomes null. In the 
region between $r_+$ and $r = m + (m^2 - a^2\cos^2\theta)^{1/2}$, 
which is called the ergoregion, $X$ is spacelike. It is 
also spacelike on and tangent to $\mathcal{E}$, except where the 
axis of rotation meets $\mathcal{E}$, where $X$ is null. Based on 
the above properties, the Kerr family provides 
natural models for rotating black holes. 
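
The quantities named above are simple functions of $m$ and $a$, and the statement that $X = \partial_t$ becomes null on the outer boundary of the ergoregion can be checked from the $dt^2$ coefficient of eqn [3]. The sketch below works in units $G = c = 1$; the function names are hypothetical and the sample values of $m$ and $a$ are arbitrary.

```python
import math

def horizon_radius(m, a):
    """r_+ = m + (m^2 - a^2)^(1/2), defined for 0 <= a < m."""
    if not 0 <= a < m:
        raise ValueError("requires 0 <= a < m")
    return m + math.sqrt(m * m - a * a)

def horizon_angular_velocity(m, a):
    """omega = a / (2 m r_+), the coefficient in K = X + omega Y."""
    return a / (2 * m * horizon_radius(m, a))

def ergosurface_radius(m, a, theta):
    """r = m + (m^2 - a^2 cos^2(theta))^(1/2), where X = d/dt becomes null."""
    return m + math.sqrt(m * m - (a * math.cos(theta)) ** 2)

def g_tt(m, a, r, theta):
    """Coefficient of dt^2 in eqn [3]: -(Delta - a^2 sin^2(theta)) / Sigma."""
    sigma = r * r + (a * math.cos(theta)) ** 2
    delta = r * r + a * a - 2 * m * r
    return -(delta - (a * math.sin(theta)) ** 2) / sigma

if __name__ == "__main__":
    m, a = 1.0, 0.6  # arbitrary illustrative values
    print("r_+   =", horizon_radius(m, a))
    print("omega =", horizon_angular_velocity(m, a))
    for theta in (0.0, math.pi / 4, math.pi / 2):
        r_e = ergosurface_radius(m, a, theta)
        print(f"theta = {theta:.3f}: r_ergo = {r_e:.4f}, g_tt = {g_tt(m, a, r_e, theta):.1e}")
```

On the axis ($\theta = 0$) the ergosurface radius coincides with $r_+$, in line with the remark that $X$ is null where the axis of rotation meets the horizon.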

Unfortunately, as opposed to the spherically 
symmetric case, there are no known explicit collap- 
sing solutions with rotating matter, in particular no 
known solutions having the Kerr metric as final 
state. 

The aim of the theory outlined below is to 
understand the general geometrical features of 
stationary black holes, and to give a classification 
of models satisfying the field equations. 


Model-Independent Concepts 


Some of the notions used informally in the 
introductory section will now be made more 
precise. The mathematical notion of black hole is 
meant to capture the idea of a region of spacetime 
which cannot be seen by "outside observers." Thus, 
at the outset, one assumes that there exists a family 
of physically preferred observers in the spacetime 
under consideration. When considering isolated 
physical systems, it is natural to define the "exterior 
observers" as observers which are "very far" away 
from the system under consideration. The standard 
way of making this mathematically precise is by 
using conformal completions, discussed in more 
detail in the article about asymptotic structure in 
this encyclopedia: a pair $(\tilde{\mathcal{M}}, \tilde{g})$ is called a conformal 
completion at infinity, or simply conformal 
completion, of $(\mathcal{M}, g)$ if $\tilde{\mathcal{M}}$ is a manifold with 
boundary such that: 


1. $\mathcal{M}$ is the interior of $\tilde{\mathcal{M}}$; 

2. there exists a function $\Omega$, with the property that 
the metric $\tilde{g}$, defined as $\Omega^2 g$ on $\mathcal{M}$, extends by 
continuity to the boundary of $\tilde{\mathcal{M}}$, with the 
extended metric remaining of Lorentzian signature; and 

3. $\Omega$ is positive on $\mathcal{M}$, differentiable on $\tilde{\mathcal{M}}$, vanishes 
on the boundary 

$$\mathscr{I} := \tilde{\mathcal{M}} \setminus \mathcal{M}$$

with $d\Omega$ nowhere vanishing on $\mathscr{I}$. 


The boundary $\mathscr{I}$ of $\tilde{\mathcal{M}}$ is called Scri, a phonic 
shortcut for "script I." The idea here is the 
following: forcing $\Omega$ to vanish on $\mathscr{I}$ ensures that $\mathscr{I}$ 
lies infinitely far away from any physical object, a 
mathematical way of capturing the notion "very far 
away." The condition that $d\Omega$ does not vanish on $\mathscr{I}$ is a 
convenient technical condition which ensures that $\mathscr{I}$ 
is a smooth three-dimensional hypersurface, instead 
of some, say, one- or two-dimensional object, or of a 
set with singularities here and there. Thus, $\mathscr{I}$ is an 
idealized description of a family of observers at 
infinity. 

To distinguish between various points of $\mathscr{I}$, one 
sets 

$$\mathscr{I}^+ = \{\text{points in } \mathscr{I} \text{ which are to the future of the physical spacetime}\}$$

$$\mathscr{I}^- = \{\text{points in } \mathscr{I} \text{ which are to the past of the physical spacetime}\}$$

(Recall that a point $q$ is to the future, respectively to 
the past, of $p$ if there exists a future directed, 
respectively past directed, causal curve from $p$ to $q$. 
Causal curves are curves $\gamma$ such that their tangent 
vector $\dot\gamma$ is causal everywhere, $g(\dot\gamma, \dot\gamma) \le 0$.) One 
then defines the black hole region $\mathcal{B}$ as 

$$\mathcal{B} := \{\text{points in } \mathcal{M} \text{ which are not in the past of } \mathscr{I}^+\} \qquad [4]$$

By definition, points in the black hole region cannot 
thus send information to $\mathscr{I}^+$; equivalently, observers 
on $\mathscr{I}^+$ cannot see points in $\mathcal{B}$. The white-hole region 
$\mathcal{W}$ is defined by changing the time orientation in [4]. 
A key notion related to the concept of a black hole is 
that of future ($\mathcal{E}^+$) and past ($\mathcal{E}^-$) event horizons, 

$$\mathcal{E}^+ := \partial\mathcal{B}, \qquad \mathcal{E}^- := \partial\mathcal{W} \qquad [5]$$


Under mild assumptions, event horizons in stationary 
spacetimes with matter satisfying the null-energy 
condition, 

$$T_{\mu\nu}\,\ell^\mu\ell^\nu \ge 0 \quad \text{for all null vectors } \ell^\mu \qquad [6]$$

are smooth null hypersurfaces, analytic if the metric 
is analytic. 

In order to develop a reasonable theory, one 
also needs a regularity condition for the interior of 
spacetime. This has to be a condition which does not 
exclude singularities (otherwise the Schwarzschild 
and Kerr black holes would be excluded), but which 
nevertheless guarantees a well-behaved exterior 
region. One such condition, assumed in all the 
results described below, is the existence in $\mathcal{M}$ of an 
asymptotically flat spacelike hypersurface $\mathcal{S}$ with 
compact interior. Further, either $\mathcal{S}$ has no boundary 
or the boundary of $\mathcal{S}$ lies on $\mathcal{E}^+ \cup \mathcal{E}^-$. To make 
things precise, for any spacelike hypersurface let $g_{ij}$ 
be the induced metric, and let $K_{ij}$ denote its extrinsic 
curvature. A spacelike hypersurface $\mathcal{S}_{\rm ext}$ diffeomorphic 
to $\mathbb{R}^3$ minus a ball will be called asymptotically 
flat if the fields $(g_{ij}, K_{ij})$ satisfy the fall-off 
conditions 

$$|g_{ij} - \delta_{ij}| + r|\partial_\ell g_{ij}| + \cdots + r^k|\partial_{\ell_1\cdots\ell_k} g_{ij}| + r|K_{ij}| + \cdots + r^k|\partial_{\ell_1\cdots\ell_{k-1}} K_{ij}| \le Cr^{-1} \qquad [7]$$

for some constants $C$, $k \ge 1$. A hypersurface $\mathcal{S}$ (with 
or without boundary) will be said to be asymptotically 
flat with compact interior if $\mathcal{S}$ is of the form $\mathcal{S}_{\rm int} \cup 
\mathcal{S}_{\rm ext}$, with $\mathcal{S}_{\rm int}$ compact and $\mathcal{S}_{\rm ext}$ asymptotically flat. 

There exists a canonical way of constructing a 
conformal completion with good global properties 
for stationary spacetimes which are asymptotically 
flat in the sense of [7], and which are vacuum 
sufficiently far out in the asymptotic region. This 
conformal completion is referred to as the standard 
completion and will be assumed from now on. 

Returning to the event horizon $\mathscr{E} = \mathscr{E}^+ \cup \mathscr{E}^-$,
it is not very difficult to show that every Killing
vector field $X$ is necessarily tangent to $\mathscr{E}$. Since
the latter set is a null Lipschitz hypersurface, it
follows that $X$ is either null or spacelike on $\mathscr{E}$. This
leads to a preferred class of event horizons, called
Killing horizons. By definition, a Killing horizon
associated with a Killing vector $K$ is a null hypersurface
which coincides with a connected component of
the set

\[ H(K) := \{p \in \mathscr{M} : g(K, K)(p) = 0,\ K(p) \ne 0\} \qquad [8] \]

A simple example is provided by the "boost Killing
vector field" $K = z\partial_t + t\partial_z$ in Minkowski spacetime:
$H(K)$ has four connected components,

\[ H_{\epsilon\delta} := \{t = \epsilon z,\ \delta t > 0\}, \qquad \epsilon, \delta \in \{\pm 1\} \]

The closure $\bar{H}$ of $H$ is the set $\{|t| = |z|\}$, which is not
a manifold, because of the crossing of the null
hyperplanes $\{t = \pm z\}$ at $t = z = 0$. Horizons of this
type are referred to as bifurcate Killing horizons,
with the set $\{K(p) = 0\}$ being called the bifurcation
surface of $H(K)$. The bifurcate horizon structure in
the Kruskal-Szekeres-Schwarzschild spacetime can
be clearly seen in Figures 1 and 2.
The Vishveshwara-Carter lemma shows that if a
Killing vector $K$ is hypersurface orthogonal,
$K^\flat \wedge dK^\flat = 0$, then the set $H(K)$ defined in [8] is a union
of smooth null hypersurfaces, with $K$ being
tangent to the null geodesics threading $H$ ("$H$ is
generated by $K$"), and so is indeed a Killing
horizon. It has been shown by Carter that the
same conclusion can be reached if the hypothesis
of hypersurface orthogonality is replaced by that
of existence of two linearly independent Killing
vector fields.

In stationary-axisymmetric spacetimes, a Killing
vector $K$ tangent to the generators of a Killing
horizon $H$ can be normalized so that $K = X + \omega Y$,
where $X$ is the Killing vector field which asymptotes
to a time translation in the asymptotic region, and $Y$
is the Killing vector field which generates rotations
in the asymptotic region. The constant $\omega$ is called
the angular velocity of the Killing horizon $H$.

On a Killing horizon $H(K)$, one necessarily has

\[ \nabla^\mu (K^\nu K_\nu) = -2\kappa K^\mu \qquad [9] \]

Assuming the so-called dominant-energy condition
on $T_{\mu\nu}$ (see Positive Energy Theorem and Other
Inequalities in GR), it can be shown that $\kappa$ is constant
(recall that Killing horizons are always connected in
the terminology used in this article); it is called the
surface gravity of $H$. A Killing horizon is called
degenerate when $\kappa = 0$, and nondegenerate otherwise;
by an abuse of terminology, one similarly talks
of degenerate black holes, etc. In Kerr spacetimes we
have $\kappa = 0$ if and only if $m = a$. A fundamental
theorem of Boyer shows that degenerate horizons
are closed. This implies that a horizon $H(K)$ such that
$K$ has zeros in the closure $\bar{H}$ of $H$ is nondegenerate, and is of bifurcate
type, as described above. Further, a nondegenerate
Killing horizon with complete geodesic generators
always contains zeros of $K$ in its closure. However, it
is not true that existence of a nondegenerate horizon
implies that of zeros of $K$: take the Killing vector field
$z\partial_t + t\partial_z$ in Minkowski spacetime from which the
2-plane $\{z = t = 0\}$ has been removed. The universal
cover of that last spacetime provides a spacetime in
which one cannot restore the points which have been
artificially removed, without violating the manifold
property.
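
As an illustration of the defining relation for the surface gravity, the following minimal sketch evaluates it for the boost Killing field $K = z\partial_t + t\partial_z$ restricted to the $(t, z)$ plane of Minkowski spacetime. The metric signature $(-, +)$, the function names, and the finite-difference step are assumptions made for this example only; they are not taken from the article.

```python
# Minkowski metric restricted to the (t, z) plane, signature (-, +) (assumed).

def boost_field(t, z):
    """Components (K^t, K^z) of the boost Killing field K = z d/dt + t d/dz."""
    return (z, t)

def norm_squared(t, z):
    """g(K, K) = -(K^t)^2 + (K^z)^2 = t^2 - z^2."""
    kt, kz = boost_field(t, z)
    return -kt * kt + kz * kz

def grad_norm_squared(t, z, h=1e-6):
    """Contravariant gradient of g(K, K), computed by central differences."""
    d_t = (norm_squared(t + h, z) - norm_squared(t - h, z)) / (2 * h)
    d_z = (norm_squared(t, z + h) - norm_squared(t, z - h)) / (2 * h)
    return (-d_t, d_z)  # raising the index flips the sign of the t component

def surface_gravity(t, z):
    """Solve grad(g(K, K)) = -2 * kappa * K for kappa at a horizon point."""
    kt, kz = boost_field(t, z)
    gt, gz = grad_norm_squared(t, z)
    # Divide by the component of K with the larger magnitude.
    return -gt / (2 * kt) if abs(kt) >= abs(kz) else -gz / (2 * kz)

kappa_plus = surface_gravity(1.5, 1.5)    # a point on the component t = z
kappa_minus = surface_gravity(1.5, -1.5)  # a point on the component t = -z
print(kappa_plus, kappa_minus)
```

With these conventions the sketch gives $\kappa = 1$ on $\{t = z\}$ and $\kappa = -1$ on $\{t = -z\}$, so every component is nondegenerate, in line with the fact that $K$ vanishes on the bifurcation surface $\{t = z = 0\}$.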

The domain of outer communications (DOC) of a
black hole spacetime is defined as

\[ \langle\langle \mathscr{M} \rangle\rangle := \mathscr{M} \setminus \{\mathscr{B} \cup \mathscr{W}\} \qquad [10] \]

Thus, $\langle\langle \mathscr{M} \rangle\rangle$ is the region lying outside of the
white-hole region and outside of the black hole region; it is
the region which can both be seen by the outside
observers and influenced by those.

The subset of $\langle\langle \mathscr{M} \rangle\rangle$ where $X$ is spacelike is called
the ergoregion. In the Schwarzschild spacetime,
$\omega = 0$ and the ergoregion is empty, but neither of
these is true in Kerr with $a \ne 0$.

A very convenient method for visualizing the 
global structure of spacetimes is provided by the 
Carter-Penrose diagrams. An example of such a 
diagram is given in Figure 2. 
Figure 2 The Carter-Penrose diagram for the Kruskal-Szekeres spacetime. There are actually two asymptotically flat regions, with
corresponding $\mathscr{I}^\pm$ and $\mathscr{E}^\pm$ defined with respect to the second region, but not indicated on this diagram. Each point in this diagram represents
a two-dimensional sphere, and coordinates are chosen so that light cones have slopes $\pm 1$. Regions are numbered as in Figure 1. (Adapted
with permission from Nicolas J-P (2002) Dirac fields on asymptotically flat space-times. Dissertationes Mathematicae 408: 1-85.)


A corollary of the topological censorship theorem
of Friedman, Schleich, and Witt is that DOCs of
regular black hole spacetimes satisfying the
dominant-energy condition are simply connected. This
implies that connected components of event horizons
in stationary spacetimes have $\mathbb{R} \times S^2$
topology.

The discussion of the concepts associated with
stationary black hole spacetimes can be concluded
by summarizing the properties of the Schwarzschild
and Kerr geometries: the extended
Kerr spacetime with $m > a$ is a black hole spacetime
with the hypersurface $\{r = r_+\}$ forming a
nondegenerate, bifurcate Killing horizon generated
by the vector field $X + \omega Y$ and surface gravity
given by

\[ \kappa = \frac{(m^2 - a^2)^{1/2}}{2m\,[m + (m^2 - a^2)^{1/2}]} \]

In the case $a = 0$, where the angular velocity $\omega$
vanishes, $X$ is hypersurface orthogonal and becomes
the generator of $H$. The bifurcation surface in this
case is the totally geodesic 2-sphere, along which the
four regions in Figure 1 are joined.
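
The surface-gravity formula for the Kerr horizon is easy to evaluate numerically. The sketch below simply codes $\kappa = (m^2 - a^2)^{1/2} / (2m[m + (m^2 - a^2)^{1/2}])$ in geometric units; the function name and the argument checks are assumptions of this example.

```python
import math

def kerr_surface_gravity(m, a):
    """kappa = sqrt(m^2 - a^2) / (2 m [m + sqrt(m^2 - a^2)]) for m > 0, |a| <= m."""
    if m <= 0 or abs(a) > m:
        raise ValueError("requires m > 0 and |a| <= m")
    root = math.sqrt(m * m - a * a)
    return root / (2.0 * m * (m + root))

schwarzschild = kerr_surface_gravity(1.0, 0.0)  # a = 0 reduces to 1 / (4 m)
extreme = kerr_surface_gravity(1.0, 1.0)        # m = a gives kappa = 0
print(schwarzschild, extreme)
```

The two sample values reproduce the Schwarzschild value $1/(4m)$ and the degenerate case $\kappa = 0$ at $m = a$.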


Classification of Stationary Solutions
("No-Hair Theorems")


We confine attention to the “outside region” of 
black holes, the DOC. (Except for the degenerate 
case discussed later, the “inside”(black hole) 
region is not stationary, so that this restriction 
already follows from the requirement of stationar- 
ity.) For reasons of space, we only consider 
vacuum solutions; there exists a similar theory 
for electro-vacuum black holes. (There is a some- 
what less developed theory for black hole space- 
times in the presence of nonabelian gauge fields.) 
In connection with a collapse scenario, the vacuum 
condition begs the question: collapse of what? The 
answer is twofold: first, there are large classes of 
solutions of Einstein equations describing pure 
gravitational waves. It is believed that sufficiently 
strong such solutions will form black holes. 
(Whether or not they will do that is related to the
cosmic censorship conjecture, see Spacetime Topology,
Causal Structure and Singularities.) Consider, next, a
dynamical situation in which matter is initially present. 
The conditions imposed in this section correspond 
then to a final state in which matter has either been 
radiated away to infinity, or has been swallowed by 
the black hole (as in the spherically symmetric 
Oppenheimer-Snyder collapse described above). 
Based on the facts below, it is expected that the
DOCs of appropriately regular, stationary, vacuum 
black holes are isometrically diffeomorphic to those 
of Kerr black holes: 


1. The rigidity theorem (Hawking). Event horizons in
regular, nondegenerate, stationary, analytic
vacuum black holes are either Killing horizons for
$X$, or there exists a second Killing vector in $\langle\langle \mathscr{M} \rangle\rangle$.

2. The Killing horizons theorem (Sudarsky-Wald).
Nondegenerate stationary vacuum black holes
such that the event horizon is the union of Killing
horizons of $X$ are static.

3. The Schwarzschild black holes exhaust the family
of static regular vacuum black holes (Israel,
Bunting and Masood-ul-Alam, Chruściel).

4. The Kerr black holes satisfying

\[ m^2 > a^2 \qquad [11] \]

exhaust the family of nondegenerate,
stationary-axisymmetric, vacuum, connected black holes.
Here m is the total Arnowitt-Deser-Misner 
(ADM) mass, while the product am is the total 
ADM angular momentum. (Of course, these 
quantities generalize the constants a and m 
appearing in the Kerr metric.) The framework 
for the proof has been set up by Carter, and the 
statement above is due to Robinson. 


The above results are collectively known under 
the name of no-hair theorems, and they have not 
provided the final answer to the problem so far. 
There are no a priori reasons known for the 
analyticity hypothesis in the rigidity theorem. 
Further, degenerate horizons have been completely 
understood in the static case only. 

Yet another key open question is that of the
existence of nonconnected regular stationary-axisymmetric
vacuum black holes. The following
result is due to Weinstein: let $\partial\mathscr{S}_a$, $a = 1, \ldots, N$, be
the connected components of $\partial\mathscr{S}$. Let $X^\flat = g_{\mu\nu} X^\mu dx^\nu$,
where $X^\mu$ is the Killing vector field which asymptotically
approaches the unit normal to $\mathscr{S}_{\rm ext}$. Similarly, set
$Y^\flat = g_{\mu\nu} Y^\mu dx^\nu$, $Y^\mu$ being the Killing vector field
associated with rotations. On each $\partial\mathscr{S}_a$, there exists
a constant $\omega_a$ such that the vector $X + \omega_a Y$ is tangent
to the generators of the Killing horizon intersecting
$\partial\mathscr{S}_a$. The constant $\omega_a$ is called the angular velocity of
the associated Killing horizon. Define

\[ m_a = -\frac{1}{8\pi} \int_{\partial\mathscr{S}_a} * dX^\flat \qquad [12] \]

\[ L_a = -\frac{1}{4\pi} \int_{\partial\mathscr{S}_a} * dY^\flat \qquad [13] \]

Such integrals are called Komar integrals. One
usually thinks of $L_a$ as the angular momentum of
each connected component of the black hole. Set

\[ \mu_a = m_a - 2\omega_a L_a \qquad [14] \]

Weinstein shows that one necessarily has $\mu_a > 0$.
The problem at hand can be reduced to a harmonic-map
equation, also known as the Ernst equation,
involving a singular map from $\mathbb{R}^3$ with Euclidean
metric $\delta$ to the two-dimensional hyperbolic space.
Let $r_a > 0$, $a = 1, \ldots, N-1$, be the distance in $\mathbb{R}^3$
along the axis between neighboring black holes as
measured with respect to the (unphysical) metric $\delta$.
Weinstein proved that for nondegenerate regular
black holes the inequality [11] holds, and that the
metric on $\langle\langle \mathscr{M} \rangle\rangle$ is determined up to isometry by the
$3N - 1$ parameters

\[ (\mu_1, \ldots, \mu_N,\ L_1, \ldots, L_N,\ r_1, \ldots, r_{N-1}) \qquad [15] \]

just described, with $r_a, \mu_a > 0$. These results by
Weinstein contain the no-hair theorem of Carter
and Robinson as a special case. Weinstein also
shows that, for every $N \ge 2$ and for every set of
parameters [15] with $\mu_a, r_a > 0$, there exists a
solution of the problem at hand. It is known that
for some sets of parameters [15] the solutions will
have "strut singularities" between some pairs of
neighboring black holes, but the existence of the
"struts" for all sets of parameters as above is not
known, and is one of the main open problems in our
understanding of stationary-axisymmetric
electro-vacuum black holes. The existence and uniqueness
results of Weinstein remain valid when strut
singularities are allowed in the metric at the outset,
although such solutions do not fall into the category
of regular black holes discussed here.
Introduction


An oscillatory integral is an integral of the form

\[ I(\omega) = \int e^{i\omega\varphi(\theta)}\, a(\theta)\, d\theta \qquad [1] \]

Here the integration is over a smooth $k$-dimensional
manifold $\Theta$ which is provided with a smooth density
$d\theta$. The real variable $\omega$ plays the role of a frequency
variable, whereas the real-valued smooth function $\varphi$ on
$\Theta$ is called the phase function. The amplitude function $a$
is assumed to be a compactly supported complex
(vector-) valued smooth function on $\Theta$. The topic of
this article is the asymptotic behavior of the oscillatory
integral $I(\omega)$ as the frequency $\omega$ tends to infinity.
When the manifold $\Theta$ is not compact and the
amplitude function is not compactly supported, then a
smooth cutoff function may be used to write the
integral as the sum of an integral with a compactly
supported amplitude and one with an amplitude which
is equal to zero in a large compact subset of $\Theta$. The
latter integral can be studied if suitable assumptions
are made about the asymptotic behavior of the phase
function and the amplitude at infinity, but this is not
the subject of this article. The use of the exponential
function with purely imaginary argument instead of
the sine and the cosine is just a matter of convenience.

The first observation about oscillatory integrals in
the next section is the principle of stationary phase,
which states that the contributions to the integral
which are not rapidly decreasing as $\omega \to \infty$ only
come from the stationary points of $\varphi$, the points $\theta \in \Theta$
where the total derivative $d\varphi(\theta)$ of $\varphi$ is equal to zero.
This principle is closely related to the observation that
a superposition of waves is maximal at points where
the waves are in phase, an observation which goes
back to Huygens (1690).

Assume that $\theta_0$ is a nondegenerate stationary point of
$\varphi$. That is, $d\varphi(\theta_0) = 0$ and the Hessian $D^2\varphi(\theta_0)$ of $\varphi$ at
$\theta_0$ is nondegenerate. Then $\theta_0$ is an isolated stationary
point of $\varphi$, and the contribution to $I(\omega)$ of a neighborhood
of $\theta_0$ has an asymptotic expansion of the form

\[ I(\omega) \sim e^{i\omega\varphi(\theta_0)} \sum_{r=0}^{\infty} c_r\, \omega^{-k/2 - r}, \qquad \omega \to \infty \]

Here the leading coefficient $c_0$ is the product of $a(\theta_0)$
with a nonzero constant which only depends on
$D^2\varphi(\theta_0)$ and the density $d\theta$ at $\theta_0$. For increasing $r$ the
coefficients $c_r$ depend on the derivatives of $\varphi$ and $a$
at $\theta_0$ of increasing order (see the section "The
method of stationary phase").

Usually, even if all the objects are analytic in a
neighborhood of $\theta_0$, the asymptotic power series
does not converge. However, there are exceptional
cases where the stationary phase approximation is
exact. Assume, for instance, that $\Theta$ is a compact
manifold provided with a symplectic form $\sigma$, $\varphi$ is the
Hamiltonian function of a Hamiltonian circle action
on $\Theta$ with isolated fixed points, and $a(\theta)\, d\theta = \sigma^l / l!$,
where $k = 2l$.
Then the stationary points of $\varphi$ are the fixed points
of the circle action, each stationary point of $\varphi$ is
nondegenerate and $I(\omega)$ is equal to the sum over the
finitely many stationary points of only the leading
terms of the asymptotic expansions at the stationary
points. This Duistermaat-Heckman formula is a
consequence of a more general localization formula
in equivariant cohomology (see the section "Exact
stationary phase").

For the purpose of applications, but also in the
analysis of oscillatory integrals, it is worthwhile to
allow complex-valued phase functions, but with a
local minimum for the imaginary part at the
stationary point $\theta_0$ of the real part. That is, the
real part of the exponent $i\omega\varphi(\theta)$ has a local
maximum at $\theta_0$. An extreme case occurs when
$\varphi(\theta) = i\psi(\theta)$ for a real-valued function $\psi$ which has a
nondegenerate local minimum at $\theta_0$, in which case
the integrand is a sharply peaking Gaussian density
at $\theta_0$. When $\varphi$ and $a$ are analytic near $\theta_0$, then the
method of steepest descent consists of deforming the
path of integration in the complex domain in such a
way that the integrand becomes such a sharply
peaking Gaussian density. During the deformation,
the integral does not change because of Cauchy's
integral theorem.

An important extension of the theory occurs if the
real-valued phase function and the amplitude are
allowed to depend smoothly on additional parameters
$x$, which vary in an $n$-dimensional smooth
manifold $M$. The amplitude is also allowed to
depend on $\omega$, with an asymptotic expansion of
the form

\[ a(x, \theta, \omega) \sim \sum_{r=0}^{\infty} a_r(x, \theta)\, \omega^{m + k/2 - r} \quad \text{as } \omega \to \infty \qquad [2] \]

The expansion is supposed to be locally uniform in
$(x, \theta)$ and to allow termwise differentiations of any
order with respect to the variables $(x, \theta)$. Then the
integral

\[ I(x, \omega) = \int e^{i\omega\varphi(x, \theta)}\, a(x, \theta, \omega)\, d\theta \]

is called an oscillatory integral of order $m$. Here the
function $x \mapsto I(x, \omega)$ is viewed as a continuous
superposition of the $\theta$-dependent family of oscillatory
functions $x \mapsto e^{i\omega\varphi(x, \theta)}\, a(x, \theta, \omega)$.

The example which formed the point of departure
of Airy (1838) is that $e^{i\omega\varphi(x, \theta)}\, a(x, \theta, \omega)$ is the wave
arriving at the point $x$ in spacetime that was
sent out by a point $\theta$ on a reflecting mirror. That is,
at $x$ one collects (= integrates over $\Theta$) all the waves
sent out by the various points $\theta$ of the mirror $\Theta$. The
main point of the theory, however, is that in great
generality the solutions of linear partial differential
equations, such as classical wave equations or
quantum mechanical Schrödinger equations, can be
represented, as functions of $x$, as oscillatory integrals.
This construction has led to decisive progress
in the general theory of linear partial differential
equations with smoothly varying coefficients.

According to the principle of stationary phase, the
main asymptotic contributions to the integral come
from the points $\theta$ such that $\partial\varphi(x, \theta)/\partial\theta = 0$. The
phase function $\varphi \in C^\infty(M \times \Theta)$ is called nondegenerate
if the $(n + k) \times k$-matrix

\[ \frac{\partial^2 \varphi(x, \theta)}{\partial(x, \theta)\, \partial\theta} \ \text{ has rank } k \text{ when } \ \frac{\partial\varphi(x, \theta)}{\partial\theta} = 0 \qquad [3] \]

This is the natural condition to ensure that the set

\[ S_\varphi := \left\{ (x, \theta) \in M \times \Theta : \frac{\partial\varphi(x, \theta)}{\partial\theta} = 0 \right\} \]

is a smooth $n$-dimensional submanifold of $M \times \Theta$.
The condition [3], moreover, implies that the mapping

\[ \iota_\varphi : S_\varphi \ni (x, \theta) \mapsto \left( x, \frac{\partial\varphi(x, \theta)}{\partial x} \right) \in T^*M \]

is a smooth immersion from $S_\varphi$ into the cotangent
bundle $T^*M$ of $M$. Note that $\xi = \partial\varphi(x, \theta)/\partial x$ is
coordinate-invariantly defined as a linear form on
the tangent space $T_xM$ of $M$ at the point $x$. That
is, $\xi \in (T_xM)^*$, the dual space of $T_xM$, and $(T_xM)^*$
is the fiber of $T^*M$ over $x$. In classical mechanics,
$T^*M$ is the phase space of the position space $M$,
and a linear form $\xi$ on $T_xM$ is called a momentum
vector at the position $x$. If $\sigma$ denotes the canonical
symplectic form on $T^*M$, then $\iota_\varphi^*\sigma = 0$. The
immersion $\iota_\varphi$ locally embeds $S_\varphi$ onto a smooth
$n$-dimensional submanifold $\Lambda_\varphi$ of $T^*M$, which is a
Lagrangian manifold in $T^*M$, which by definition
means that $\sigma|_{\Lambda_\varphi} = 0$.

Oscillatory integrals with very different phase
functions and amplitudes can define the same
$\omega$-dependent functions on $M$. The theory of
Hörmander (1971, section 3.1) says that the germs
of the Lagrangian manifolds $\Lambda_\varphi$ and $\Lambda_\psi$ are the same
if and only if $\varphi$ and $\psi$ define the same class of
oscillatory integrals. Moreover, every Lagrangian
submanifold $\Lambda$ of $T^*M$ is locally of the form $\Lambda_\varphi$
for some nondegenerate phase function $\varphi$. In this
way, the mapping $\varphi \mapsto \Lambda_\varphi$ defines a bijection
between the set of equivalence classes of germs of
nondegenerate phase functions and the set of germs
of Lagrangian submanifolds of $T^*M$. Let $\Lambda$ be an
immersed Lagrangian submanifold of $T^*M$. A
global oscillatory integral of order $m$ on $M$, defined
by $\Lambda$, is a locally finite sum $u(x, \omega)$ of oscillatory
integrals of order $m$ with nondegenerate phase
functions $\varphi$ such that $\Lambda_\varphi \subset \Lambda$. The leading terms of
the amplitudes correspond to a section $s$ of a
canonically defined complex line bundle $\mathbb{L}$ over $\Lambda$,
which is called the principal symbol of $u$ (see the
section "The principal symbol on the Lagrangian
manifold").

If $P$ is a linear partial differential operator, such as
the wave operators, in which the coefficients may
depend in a smooth way on $x$ and in a polynomial
way on $\omega$, then the condition that $Pu$ is asymptotically
small implies that $p = 0$ on $\Lambda$, in which $p$ is a
smooth function on $T^*M$, called the principal
symbol of $P$. Because $\Lambda$ is a Lagrangian manifold,
the equation $p = 0$ implies that $\Lambda$ is invariant under
the flow of the Hamiltonian system with Hamilton
function equal to $p$. Furthermore, the principal
symbol $s$ of $u$ satisfies a homogeneous first-order
ordinary differential equation along the solution
curves of the Hamiltonian system. Conversely, these
properties can be used to construct global oscillatory
integrals $u$ which asymptotically satisfy $Pu = 0$ and
have prescribed initial values. This theory, due to
Maslov (1972), may be viewed as a far-reaching
generalization of the WKB method.

Let $\pi : T^*M \to M : (x, \xi) \mapsto x$ denote the canonical
projection from $T^*M$ onto $M$. The projections into
$M$ of the solution curves in a Lagrangian submanifold
$\Lambda$ of $T^*M$, of a Hamiltonian system which
leaves $\Lambda$ invariant, are the ray bundles of geometrical
optics. If $\Lambda$ is not transversal to the fiber of
$T^*M$ at $(x, \xi)$, then the ray bundle exhibits a caustic
at the point $x \in M$, and the oscillatory integral is
asymptotically of larger order than $\omega^m$ near $x$.
Applying the theory of unfoldings of singularities
to the phase function, one can determine the
structurally stable caustics and obtain normal
forms of the oscillatory integrals in the structurally
stable cases (see the section "Caustics").

If we also integrate over the frequency variable $\omega$,
then we obtain the Fourier integral distributions $u$ of
Hörmander (1971, sections 1.2 and 3.2). In this case
the corresponding Lagrangian manifold is conic in
the sense that if $(x, \xi) \in \Lambda$, then $(x, \tau\xi) \in \Lambda$ for every
$\tau > 0$. The wave front set of $u$, which is the
microlocal singular locus of the distribution $u$, is
contained in $\Lambda$, with equality if the principal symbol
of $u$ is not equal to zero at the corresponding
stationary points of the phase function. Fourier
integral operators are defined as the linear operators
acting on distributions, of which the distribution
kernels are Fourier integral distributions. Under a
suitable transversality condition for the Lagrangian
manifolds of the distribution kernels, the composition
of two Fourier integral operators is again a
Fourier integral operator, and the principal symbol
of the composition is a product of the principal
symbols. The proof is an application of the method
of stationary phase. Fourier integral operators are a
very powerful tool in the analysis of linear partial
differential operators with smoothly varying coefficients
(see Hörmander (1985)).


The Principle of Stationary Phase 


The principle of stationary phase says that if the
phase function $\varphi$ has no stationary points in
the support of the amplitude function $a$, then the
oscillatory integral [1] is rapidly decreasing, in the
sense that for every $N$ we have $I(\omega) = O(\omega^{-N})$ as
$\omega \to \infty$. For the proof, one introduces a vector field $v$
on $\Theta$ such that $v\varphi = 1$ on a neighborhood of the
support of $a$. Then $e^{i\omega\varphi} = (i\omega)^{-1}\, v(e^{i\omega\varphi})$, and an
integration by parts in [1] yields that

\[ I(\omega) = \frac{1}{i\omega} \int e^{i\omega\varphi(\theta)}\, ({}^t v\, a)(\theta)\, d\theta \]

where ${}^t v$ denotes the transpose of the linear partial
differential operator $v$. Iterating this, the rapid
decrease of $I(\omega)$ follows.
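
The contrast between a phase with and without stationary points can be seen in a small numerical experiment. The sketch below is illustrative only: it takes $k = 1$, uses a Gaussian amplitude $e^{-\theta^2/2}$ as a stand-in for a compactly supported one (an assumption made so that the linear-phase integral has the closed form $\sqrt{2\pi}\, e^{-\omega^2/2}$), and approximates the integrals with the trapezoid rule on a truncated interval. All function names are hypothetical.

```python
import cmath
import math

def oscillatory_integral(phase, amplitude, omega, lo=-12.0, hi=12.0, n=4800):
    """Trapezoid-rule approximation of the integral of exp(i*omega*phase) * amplitude."""
    h = (hi - lo) / n
    total = 0.0 + 0.0j
    for j in range(n + 1):
        theta = lo + j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(1j * omega * phase(theta)) * amplitude(theta)
    return total * h

def gaussian(theta):
    """Rapidly decaying amplitude (stand-in for a compactly supported one)."""
    return math.exp(-0.5 * theta * theta)

def linear(theta):
    """Phase without stationary points."""
    return theta

def quadratic(theta):
    """Phase with a nondegenerate stationary point at theta = 0."""
    return 0.5 * theta * theta

for omega in (2.0, 4.0, 6.0):
    no_stationary = abs(oscillatory_integral(linear, gaussian, omega))
    stationary = abs(oscillatory_integral(quadratic, gaussian, omega))
    print(omega, no_stationary, stationary)
```

In this experiment the linear-phase integral falls off faster than any power of $\omega$, while the quadratic-phase integral decays only like $\omega^{-1/2}$.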

Using cutoff functions, $I(\omega)$ is, modulo a rapidly
decreasing function, equal to an oscillatory integral
with phase function $\varphi$ and an amplitude which
has support in an arbitrarily small neighborhood of
the set of stationary points of $\varphi$. In this sense,
the contributions to the integral which are not
rapidly decreasing come only from the stationary
points of $\varphi$.


The Method of Stationary Phase 


Assume that $\theta_0$ is a nondegenerate stationary point
of $\varphi$. Then $\theta_0$ is an isolated stationary point of $\varphi$.
Using local coordinates near $\theta_0$, the contribution to
[1] from the neighborhood of $\theta_0$ can be written as an
oscillatory integral with $\Theta = \mathbb{R}^k$ and a phase function
$\varphi$ which has a nondegenerate stationary point at 0.
Write $Q = D^2\varphi(0)$. According to the Morse lemma,
there is a smooth substitution of variables $\theta = T(y)$
such that $T(0) = 0$, $DT(0) = I$, and $\varphi(T(y)) = \varphi(0) +
\langle Qy, y \rangle / 2$ for all $y$ in a neighborhood of 0 in $\mathbb{R}^k$.
Applying this substitution of variables to [1] we
obtain

\[ I(\omega) = e^{i\omega\varphi(0)} \int_{\mathbb{R}^k} e^{i\omega\langle Qy, y \rangle / 2}\, b(y)\, dy \]

where $b$ is a compactly supported smooth function
on $\mathbb{R}^k$ with $b(0) = a(0)$. Now the Fourier transform
of the function $y \mapsto e^{i\omega\langle Qy, y \rangle / 2}$ is equal to the function

\[ \eta \mapsto \left( \det\left( \frac{1}{2\pi i}\, \omega Q \right) \right)^{-1/2} e^{-i\langle Q^{-1}\eta, \eta \rangle / (2\omega)} \qquad [4] \]

Both in the definition of the square root of the
determinant and in the proof one uses the analytic
continuation to the domain of complex-valued
symmetric bilinear forms $Q$ for which the imaginary
part of $Q$ is positive definite. For purely imaginary
$Q$ we have the familiar formula for the Fourier
transform of a Gaussian density (see Hörmander
(1990, theorem 7.6.1)). The Taylor expansion of the
exponential factor in [4] then yields that

\[ \int e^{i\omega\langle Qy, y \rangle / 2}\, b(y)\, dy \sim \left( \det\left( \frac{1}{2\pi i}\, \omega Q \right) \right)^{-1/2} \sum_{r=0}^{\infty} \frac{1}{r!} \left( \frac{i}{2\omega} \left\langle Q^{-1} \frac{\partial}{\partial y}, \frac{\partial}{\partial y} \right\rangle \right)^{r} b(0) \]

as $\omega \to \infty$ (see Hörmander (1990, lemma 7.7.3)).

It is important for the applications that, if the 
phase function and amplitude depend smoothly on 
parameters, all the constructions can be made to 
depend smoothly on the parameters. 
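
A quick numerical check of the stationary phase expansion in the simplest case $k = 1$, $Q = 1$ may be helpful. The sketch below is a minimal example under stated assumptions: the amplitude is taken to be the Gaussian $b(y) = e^{-y^2/2}$ (not compactly supported, but with the exact value $\sqrt{2\pi/(1 - i\omega)}$ available for comparison), the leading term is $(\omega/(2\pi i))^{-1/2}\, b(0)$, and the first correction is taken to be the factor $1 + (i/(2\omega))\, b''(0)$ with $b''(0) = -1$. The function names are hypothetical.

```python
import cmath
import math

def quadratic_phase_integral(omega, lo=-12.0, hi=12.0, n=24000):
    """Trapezoid approximation of the integral of exp(i*omega*y^2/2) * exp(-y^2/2)."""
    h = (hi - lo) / n
    total = 0.0 + 0.0j
    for j in range(n + 1):
        y = lo + j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(0.5 * (1j * omega - 1.0) * y * y)
    return total * h

def leading_term(omega):
    """(omega / (2*pi*i))^(-1/2) * b(0) with b(0) = 1, principal square root."""
    return cmath.sqrt(2j * math.pi / omega)

def two_term(omega):
    """Leading term times 1 + (i / (2*omega)) * b''(0), where b''(0) = -1."""
    return leading_term(omega) * (1.0 - 0.5j / omega)

omega = 100.0
numeric = quadratic_phase_integral(omega)
err0 = abs(numeric / leading_term(omega) - 1.0)  # expected to be of order 1/omega
err1 = abs(numeric / two_term(omega) - 1.0)      # expected to be of order 1/omega^2
print(err0, err1)
```

The relative error drops by roughly a factor of $\omega$ when the first correction is included, which is the behavior one expects from an asymptotic expansion in powers of $\omega^{-1}$.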


Exact Stationary Phase 


Suppose that we are given an action of a Lie group
$G$ on the manifold $\Theta$. Let $\mathfrak{g}$ denote the Lie algebra of
$G$. For any $g \in G$ and $X \in \mathfrak{g}$ the corresponding
diffeomorphism of $\Theta$ and vector field on $\Theta$ are
denoted by $g_\Theta$ and $X_\Theta$, respectively. If $\Omega(\Theta)$ denotes
the algebra of smooth differential forms on $\Theta$, then
we consider the algebra $S\mathfrak{g}^* \otimes \Omega(\Theta)$ of all
$\Omega(\Theta)$-valued polynomials on $\mathfrak{g}$, where $S\mathfrak{g}^*$ denotes the
algebra of all polynomial functions on $\mathfrak{g}$. On
$S\mathfrak{g}^* \otimes \Omega(\Theta)$ we have the action of $g \in G$ which sends $\alpha$ to
$X \mapsto g_\Theta^*(\alpha(\mathrm{Ad}\, g\, X))$. Let $A = (S\mathfrak{g}^* \otimes \Omega(\Theta))^G$ denote
the subalgebra of all $G$-invariant elements of
$S\mathfrak{g}^* \otimes \Omega(\Theta)$. The equivariant exterior derivative $D$ is
defined by

\[ (D\alpha)(X) = d(\alpha(X)) - i_{X_\Theta}(\alpha(X)) \]

If $\alpha$ is homogeneous as a differential form of degree $p$
and homogeneous as a polynomial on $\mathfrak{g}$ of degree $q$,
then $r = p + 2q$ is called the total degree of $\alpha$. Let $A^r$
denote the space of sums of such $\alpha \in A$ of total degree $r$.
Then $D_r = D : A^r \to A^{r+1}$ and $D_r \circ D_{r-1} = 0$. The space
$H^r_G(\Theta) := \ker D_r / \mathrm{Im}\, D_{r-1}$ is called the equivariant
cohomology in degree $r$, in the model of Cartan (1950).

Assume that $\Theta$ is compact and oriented, and that
the action of $G$ preserves the orientation. If $\alpha \in A$,
then we denote by $\alpha(X)^{[k]}$ the volume part of the
differential form $\alpha(X)$, and

\[ \left( \int \alpha \right)(X) := \int_\Theta \alpha(X)^{[k]}, \qquad X \in \mathfrak{g} \]

defines an $\mathrm{Ad}\, G$-invariant function $\int \alpha$ on $\mathfrak{g}$. Now
$\alpha = D\beta$ implies that $\alpha(X)^{[k]}$ is equal to the exterior
derivative of $\beta(X)^{[k-1]}$, and therefore $\int \alpha = 0$, in view
of Stokes' theorem. It follows that integration over $\Theta$
yields a linear mapping $\int$ from $H_G(\Theta)$ to $(S\mathfrak{g}^*)^{\mathrm{Ad}\, G}$,
which is called integration in equivariant cohomology.
Now assume that also the Lie group $G$ is
compact, and let $X \in \mathfrak{g}$. Then the zero-set $Z_X$ of
$X_\Theta$ in $\Theta$ has finitely many connected components $F$,
each of which is a smooth and compact submanifold
of $\Theta$. In general, the $F$s can have different
dimensions. The linearization $L_X$ of the vector
field $X_\Theta$ along $F$ acts linearly on the normal bundle
$NF$ of $F$. If $\Omega$ is the curvature form of $NF$, then

\[ \varepsilon(X) = \det\nolimits_{\mathbb{C}} \left( \frac{1}{2\pi i} (L_X - \Omega) \right) \]

is called the equivariant Euler form of $NF$. $\varepsilon(X)$ is an
invertible element in the algebra $\Omega^{\rm even}(F)$. The
localization formula of Berline-Vergne (1982) and
Atiyah-Bott (1984) now says that if $D\alpha = 0$ then

\[ \left( \int \alpha \right)(X) = \sum_F \int_F \frac{\alpha(X)}{\varepsilon(X)} \]


Assume that $\sigma$ is a symplectic form on $\Theta$, which
implies that $k = 2l$ is even. Furthermore, assume that
the infinitesimal action of $\mathfrak{g}$ on $\Theta$ is Hamiltonian,
which means that there exists a $G$-equivariant
smooth mapping $\mu : \Theta \to \mathfrak{g}^*$, called the momentum
mapping, such that $i_{X_\Theta}\sigma = -d(\mu(X))$ for every $X \in \mathfrak{g}$.
Here $\mu$ is viewed as an element of $(\mathfrak{g}^* \otimes \Omega^0(\Theta))^G \subset A$.
Then $\hat\sigma(X) := \sigma - \mu(X)$ defines an element $\hat\sigma \in A$ such
that $D\hat\sigma = 0$. In turn, this implies that the form

\[ \beta(X) := e^{-i\omega\hat\sigma(X)} = e^{i\omega\mu(X)} \sum_{r=0}^{l} (-i\omega\sigma)^r / r! \]

is equivariantly closed, and the localization formula
of equivariant cohomology applied to this case
yields the Duistermaat-Heckman (1982, 1983)
formula. Because $\beta(X)^{[k]} = e^{i\omega\mu(X)} (-i\omega\sigma)^l / l!$, its integral
over $\Theta$ is an oscillatory integral with phase
function $\mu(X)$. The stationary points of $\mu(X)$ are the
zeros of $X_\Theta$ and the stationary points of $\mu(X)$ are
nondegenerate if and only if the zeros of $X_\Theta$ are
isolated. It follows that in this case the oscillatory
integral is equal to the leading term in the
stationary-phase approximation.


The Principal Symbol on the 
Lagrangian Manifold 


Let $u(x, \omega)$ be a global oscillatory integral of order $m$
defined by $\Lambda$, and let $(x_0, \xi_0) \in \Lambda$. One way to define
the principal symbol of $u$ at $(x_0, \xi_0) \in \Lambda$ is to test $u$
with an oscillatory function of the form $e^{-i\omega\psi(x)}\, b(x)$,
in which $d\psi(x_0) = \xi_0$, the support of $b$ is contained
in a small neighborhood of $x_0$, and $b(x_0) = 1$. If $u$ is
locally represented by the phase function $\varphi$ and
amplitude $a$, and $(x_0, \xi_0) = \iota_\varphi(x_0, \theta_0)$, then the phase
function $\varphi(x, \theta) - \psi(x)$ in the oscillatory integral

\[ \int\!\!\int e^{i\omega(\varphi(x, \theta) - \psi(x))}\, a(x, \theta, \omega)\, b(x)\, d\theta\, dx \]

has a stationary point at $(x_0, \theta_0)$, which means that
$\Lambda$ and $d\psi$ intersect at $(x_0, \xi_0)$. Here the 1-form $d\psi$ on
$M$, which is a section of $\pi : T^*M \to M$, is viewed as a
submanifold of $T^*M$. Locally the Lagrangian submanifolds
of $T^*M$ which are transversal to the fibers
of $\pi : T^*M \to M$ are precisely the manifolds of the
form $d\psi$. The stationary point of $\varphi - \psi$ is nondegenerate
if and only if $L := T_{(x_0, \xi_0)}\Lambda$ and
$L_\psi = T_{(x_0, \xi_0)}(d\psi)$ are transversal. In this case, the
method of stationary phase can be applied in order
to obtain an asymptotic expansion in terms of
powers of $\omega$. The coefficient of the leading term of
order $\omega^m$ depends only on the Lagrangian plane $L_\psi$,
which is transversal to both $L$ and the tangent space
of the fiber of $T^*M$, and not on the other data of $\psi$
and $b$. If $\mathscr{L}$ denotes the set of all Lagrangian planes
in $T_{(x_0, \xi_0)}(T^*M)$ which are transversal to both $L$ and
the fiber, then the complex-valued functions on $\mathscr{L}$
which arise in this way form a one-dimensional
complex vector space $\mathbb{L}_{(x_0, \xi_0)}$. The $\mathbb{L}_{(x_0, \xi_0)}$ for
$(x_0, \xi_0) \in \Lambda$ form a complex line bundle $\mathbb{L}$ over $\Lambda$
which is canonically isomorphic to the tensor
product of the line bundle of half-densities and the
Maslov line bundle, a line bundle with structure
group $\mathbb{Z}/4\mathbb{Z}$ (see Duistermaat (1974, section 1.2)).
In this way, the principal symbol $s$ of $u$ can be
viewed as a section of the line bundle $\mathbb{L}$ over $\Lambda$.


Caustics 


Let $(x_0, \xi_0)$ be a point in the Lagrangian submanifold
$\Lambda$ of $T^*M$. The restriction to $\Lambda$ of the projection
$\pi : T^*M \to M$ is a diffeomorphism from an open
neighborhood of $(x_0, \xi_0)$ in $\Lambda$ onto an open neighborhood
of $x_0$ in $M$, if and only if $\Lambda$ is transversal
to the fiber of $T^*M$ at $(x_0, \xi_0)$. If $\Lambda = \Lambda_\varphi$ for a
nondegenerate phase function $\varphi$, $(x_0, \theta_0) \in S_\varphi$ and
$(x_0, \xi_0) = \iota_\varphi(x_0, \theta_0)$, then this condition is in turn
equivalent to the condition that $\theta_0$ is a nondegenerate
stationary point of $\theta \mapsto \varphi(x_0, \theta)$. An application of the
method of stationary phase shows that in this case the
oscillatory integral is equal to a progressing wave of
the form $e^{i\omega\psi(x)}\, b(x, \omega)$. Here $\psi(x) = \varphi(x, \theta(x))$, where
$\theta(x)$ is the stationary point of $\theta \mapsto \varphi(x, \theta)$, and $b(x, \omega)$
has an asymptotic expansion as in [2] with $k = 0$.

If $\theta_0$ is a degenerate stationary point of
$\theta \mapsto \varphi(x_0, \theta)$ and $a_0(x_0, \theta_0) \ne 0$, then the oscillatory
integral is not of order $O(\omega^m)$. That is, it is of larger
order than at points where we have a nondegenerate
stationary point. For this reason, the points $(x_0, \xi_0)$
at which $\Lambda$ is not transversal to the fibers of
$\pi : T^*M \to M$ are called the caustic points of $\Lambda$.
Their projections $x_0 \in M$ form the caustic set in $M$.

In the theory of unfoldings of singularities, the
germs of the families of functions $x \mapsto (\theta \mapsto \varphi(x, \theta))$
and $y \mapsto (\eta \mapsto \psi(y, \eta))$ are called equivalent if there
exists a germ of a diffeomorphism of the form
$H : (x, \theta) \mapsto (y(x), \eta(x, \theta))$ and a smooth function $\chi(x)$
such that $\psi(y(x), \eta(x, \theta)) = \varphi(x, \theta) + \chi(x)$. If $J(y, \omega)$ is
an oscillatory integral with phase function $\psi$,
integration variable $\eta$ and parameter $y$, then the
substitution of variables $\eta = \eta(x, \theta)$ in the integral,
followed by the substitution of variables $y = y(x)$ in
the parameters, yields that $J(y(x), \omega) = e^{i\omega\chi(x)}\, I(x, \omega)$, in
which $I(x, \omega)$ is an oscillatory integral with phase
function $\varphi$ and an amplitude function of the same
order as the amplitude function of $J$. The germ $\varphi$ is
called stable if every nearby germ $\psi$ is equivalent to
$\varphi$. The Morse lemma with parameters implies that
this is the case if $\theta \mapsto \varphi(x_0, \theta)$ has a nondegenerate
stationary point at $\theta_0$. However, the theory of
unfoldings of singularities of Thom and Mather
shows that there are many stable germs with
degenerate critical points. Moreover, in dimension
$n \le 5$ the generic germ is stable, and is equivalent to
a germ in a finite list of normal forms.

The simplest example of a normal form with
degenerate critical points is $(x, \theta) \mapsto \theta^3 + x_1\theta$. Here
we have taken $k = 1$, but still allowed an arbitrary
dimension $n \geq 1$ of $M$. In this normal form, the
stationary points correspond to $3\theta^2 + x_1 = 0$, which
is a manifold which over the $x$-space folds over at
$x_1 = 0$. The stationary point is degenerate if and only
if $6\theta = 0$, hence $x_1 = 0$, which means that $x_1 = 0$ is
the caustic set. If the amplitude is equal to 1, then
the oscillatory integral is equal, up to constant factors,
to $\omega^{-1/3}\,\mathrm{Ai}(\omega^{2/3}x_1)$,
in which $\mathrm{Ai}(z)$ denotes the Airy function. If the
amplitude is nonzero at a degenerate critical point,
then the oscillatory integral near the corresponding
caustic point is asymptotically of the same order as
$\omega^{-1/3}\,\mathrm{Ai}(\omega^{2/3}x_1)$, which implies that the oscillatory
integral is a factor $\omega^{1/6}$ larger at these caustic points
than at the points away from the caustic set. In Airy
(1838), where the Airy function was introduced, Airy
considered light in a neighborhood of a caustic as an
oscillatory integral. Then, under suitable genericity
conditions, he brought the phase function into the
normal form $\theta^3 + x_1\theta$. Even for stable normal forms
in low dimensions, the interference patterns near the
caustic points can be very intricate (see, e.g., Berry
et al. (1979)). A survey of the application of the
theory of unfoldings to caustics in oscillatory
integrals can be found in Duistermaat (1974).
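
The scaling of the Airy-type integral quoted above can be checked by a change of variables. The following is only a sketch for amplitude 1; it assumes the standard integral representation $\mathrm{Ai}(z) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i(t^3/3 + zt)}\,dt$, which is not stated in the text, and the name $I(x_1, \omega)$ is introduced here for illustration.

```latex
\begin{align*}
I(x_1,\omega)
  &= \int_{-\infty}^{\infty} e^{i\omega(\theta^3 + x_1\theta)}\,d\theta,
  \qquad \theta = (3\omega)^{-1/3}\,t, \\
  &= (3\omega)^{-1/3}\int_{-\infty}^{\infty}
     e^{i\left(t^3/3 \,+\, 3^{-1/3}\omega^{2/3}x_1\,t\right)}\,dt \\
  &= 2\pi\,(3\omega)^{-1/3}\,
     \mathrm{Ai}\!\left(3^{-1/3}\,\omega^{2/3}\,x_1\right).
\end{align*}
```

At a nondegenerate stationary point in one integration variable, the method of stationary phase gives order $\omega^{-1/2}$, so the ratio of the two orders is $\omega^{-1/3}/\omega^{-1/2} = \omega^{1/6}$.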


See also: Equivariant Cohomology and the Cartan 
Model; Feynman Path Integrals; Functional Integration in 
Quantum Physics; Hamiltonian Group Actions; 
h-Pseudodifferential Operators and Applications; 
Multiscale Approaches; Normal Forms and Semiclassical 
Approximation; Optical Caustics; Path Integrals in 
Noncommutative Geometry; Perturbation Theory and its 
Techniques; Schrödinger Operators; Singularity and
Bifurcation Theory; Wave Equations and Diffraction.
Introduction


Equilibrium statistical mechanics and combinatorial
optimization (viewed here as a branch of
discrete mathematics and theoretical computer
science) have common roots. Phase transitions are
mathematical phenomena which are not limited to 
physical systems but are typical of many combina- 
torial problems, one famous example being the 
percolation transition in random graphs. Similarly, 
the understanding of relevant physical problems, 
such as three-dimensional lattice statistics or two- 
dimensional quantum statistical mechanics pro- 
blems, is strictly related to the question of purely 
combinatorial origin of solving counting problems 
over nonplanar lattices. Most of the tools and
concepts that have made it possible to solve problems in
one field have a natural counterpart in the other.
While the possibility of solving exactly physical 
models is always connected to the presence of some 
algebraic properties which guarantee integrability, in 
the combinatorial approach the emphasis is more on 
algorithms that can be applied to problem instances 
in which the symmetries behind integrability might
be absent. Also at the level of out-of-equilibrium 
phenomena, there exists a deep connection between 
physics and combinatorics: just like physical pro- 
cesses, local algorithms have to deal with an 
exponentially large set of possible configurations 
and their out-of-equilibrium analysis constitutes a 
theory of how problems are actually solved. 
Computational complexity theory deals with 
classifying problems in terms of the computational 
resources, typically time, required for their solution. 
What can be measured (or computed) is the time 
that a particular algorithm uses to solve the 
problem. This time in turn depends on the imple- 
mentation of the algorithm as well as on the 
computer the program is running on. The theory of 
computational complexity provides us with a notion 
of complexity that is largely independent of imple- 
mentational details and the computer at hand. This 
is not surprising, since it is related to a highly 
nontrivial question, that is: what do we mean by 
saying that a combinatorial problem is solvable? 
Problems which can be solved in polynomial time 
are considered to be tractable and compose the so- 
called polynomial (P) class. The harder problems are
grouped in a larger class called NP, where NP stands
for "nondeterministic polynomial time." These 
problems are such that a potential solution can be 
checked rapidly in polynomial time, while finding a 
solution may require exponential time in the worst 
case. In turn, the hardest problems in NP belong to a 
subclass called NP-complete: an efficient algorithm 
for solving one NP-complete problem could be 
easily modified to effectively solve any problem in 
NP. By now, a huge number of NP-complete 
problems has been identified, and the lack of such 
an algorithm corroborates the widespread conjecture
P ≠ NP, that is, that no such algorithm exists.
However, NP-complete problems are not always 
hard: when their resolution complexity is measured 
with respect to some underlying probability distri- 
bution of problem instances, NP-complete problems 
are often easy to solve on average. To deepen the 
understanding of the average-case complexity (and 
of the huge variability of running times observed in 
numerical experiments), computer scientists, mathe- 
maticians, and physicists have focused their atten- 
tion on the study of random instances of hard 
combinatorial problems, seeking a link between
the onset of exponential-time complexity and some 
intrinsic (i.e., algorithm independent) properties of 
the randomized NP-complete problems. These types 
of questions have merged combinatorial optimiza- 
tion with statistical physics of disordered systems. 

Computational complexity theory can also be 
formulated for counting problems: similarly to 
optimization problems, equivalence classes can be 
defined which separate polynomially solvable counting
problems from the hard ones (the so-called #P
and #P-complete problems). Complexity theory for
counting problems makes the connections with 
statistical mechanics even more direct in that 
counting solutions is nothing but a computation of 
a partition function. 

Two simple theorems by Jerrum and Sinclair 
(1989) (which can be easily extended to many 
combinatorial problems) can help in clarifying 
these connections. 

The first theorem tells us that any randomized 
algorithm (e.g., Monte Carlo) for approximating the 
partition function of a generic spin glass model (the
so-called spin glass problem) could be used to solve
all the other NP combinatorial problems. The
second theorem tells us that an algorithm for 
evaluating exactly the partition function of the 
ferromagnetic Ising model over a general graph 
would again solve any other problem in the class #P, 
which, as mentioned above, is the generalization of
the NP class to counting problems and obviously
contains the class NP as a particular case.

Let us consider the following slightly simplified
definition of the Ising and the spin glass problems. 


Problem instance A symmetric matrix $J_{ij}$ with
entries in $\{-1, 0, 1\}$ and an inverse temperature $\beta$.


Output The partition function $Z = \sum_{\sigma} e^{-\beta H(\sigma)}$,
where $H(\sigma) = -\sum_{i<j} J_{ij}\,\sigma_i\sigma_j$ with $\sigma_i = \pm 1$.
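
To make the problem statement concrete, here is a minimal brute-force sketch of the output. It assumes that the sum in the Hamiltonian runs over unordered pairs $i < j$ (the summation index is an assumption, as is the function name `ising_partition_function`); it enumerates all $2^n$ spin configurations and is therefore usable only for very small instances, which is exactly why the complexity question is interesting.

```python
import itertools
import math


def ising_partition_function(J, beta):
    """Brute-force Z = sum over sigma of exp(-beta * H(sigma)).

    J is a symmetric n x n matrix (list of lists) with entries in {-1, 0, 1}.
    H(sigma) = -sum_{i<j} J[i][j] * sigma_i * sigma_j with sigma_i = +/-1
    (summing over unordered pairs is an assumption of this sketch).
    The loop visits 2**n configurations, so n must stay small.
    """
    n = len(J)
    Z = 0.0
    for sigma in itertools.product((-1, 1), repeat=n):
        H = -sum(J[i][j] * sigma[i] * sigma[j]
                 for i in range(n) for j in range(i + 1, n))
        Z += math.exp(-beta * H)
    return Z
```

For two coupled spins with $J_{12} = 1$ this gives $Z = 2e^{\beta} + 2e^{-\beta}$. A non-negative matrix corresponds to the Ising problem, and a matrix with mixed signs to the spin glass problem.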


Moreover, let us define the fully polynomial
randomized approximation scheme (FPRAS) for
counting and decision problems. An FPRAS for a
function f from problem instances to real numbers is
a probabilistic algorithm that, in time polynomial in
the problem size n and in the inverse 1/ε of the
relative error ε ∈ (0, 1], outputs with high probability
a number which approximates f within a ratio 1 + ε.
Given the above definitions, the theorems can be
stated as follows:


Theorem 1 There can be no FPRAS for the spin 
glass problem unless P = NP, that is, all problems in 
NP turn out to be solvable in polynomial time. 


Theorem 2 The Ising problem is #P-complete even 
when the matrix $J_{ij}$ is non-negative, that is, an
algorithm which outputs in polynomial time the 
exact Ising partition function for an arbitrary graph 
could be used to solve any other counting problem 
in #P. 


The above theorems hold for arbitrary graphs, 
in particular for those graph or lattice realizations 
which are particularly hard to analyze, the so-called 
worst cases. There exist no similar proofs of 
computational hardness for more restricted and 
realistic structures, such as, for instance, three- 
dimensional regular lattices for the Ising problem 
or finite connectivity random graphs for spin glasses. 

As a final introductory remark, it is worth
mentioning that the connection between worst-case
and average-case complexity is the
building block of modern cryptography and communication
theory. On the one hand, the so-called
RSA cryptosystem is based on factoring large 
integers, a problem which is believed to be hard on 
average while it is not known to be so in the worst 
case. On the other hand, alternative cryptographic 
systems have been proposed which rely on a worst- 
case/average-case equivalence (see, e.g., the theorem 
of Ajtai (1996) concerning some hidden vector 
problems in high-dimensional lattices).

As far as communication theory is concerned,
average-case complexity is indeed crucial: while
Shannon’s theorem (1948) provides a very general
result stating that many optimal codes do exist (in
fact, random codes are optimal), the decoding
problem is in general NP-complete and therefore
potentially intractable. However, since the choice of
the coding scheme is part of the design, what
matters is the average-case behavior of the decoding
algorithm (and its large deviations), and very
efficient codes are known which can solve the
decoding problem on average close to Shannon’s
bounds.

In what follows, we will limit the discussion to 
two basic examples of combinatorial and counting 
problems which are representative and central to 
both computer science and statistical physics. 


Constraint Satisfaction Problems 


Combinatorial problems are usually written as
constraint satisfaction problems (CSPs): n discrete
variables are given which have to satisfy m
constraints, all at the same time. Each constraint
can take different forms depending on the problem
under study: famous examples are the
K-satisfiability (K-SAT) problem, in which constraints
are an “OR” function of K variables in the ensemble
(or their negations), and the graph Q-coloring
problem, in which constraints simply enforce the
condition that the endpoints of the edges in the graph
must not have the same color (among the Q possible
ones). Quite generally, a CSP can be written
as the problem of finding a zero-energy ground state
of an appropriate energy function, and its analysis
amounts to performing a zero-temperature statistical
physics study. Hard combinatorial problems are
those which correspond to frustrated physical model
systems.
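
The correspondence between a CSP and a zero-energy ground-state problem can be sketched for graph coloring as follows. The energy is taken to be the number of violated constraints (edges whose endpoints share a color); the function names and the exhaustive search are illustrative assumptions, not a method from the text.

```python
import itertools


def coloring_energy(edges, colors):
    """Energy = number of violated constraints, i.e. monochromatic edges."""
    return sum(1 for u, v in edges if colors[u] == colors[v])


def ground_state(n_vertices, edges, n_colors):
    """Exhaustive search for a minimum-energy coloring.

    Returns (energy, colors). The cost is n_colors**n_vertices, so this is
    a toy illustration only. The instance is SAT iff the energy is 0.
    """
    best = None
    for colors in itertools.product(range(n_colors), repeat=n_vertices):
        e = coloring_energy(edges, colors)
        if best is None or e < best[0]:
            best = (e, colors)
    return best
```

A triangle is the simplest frustrated example: with three colors the ground-state energy is 0, while with two colors at least one edge must be violated.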

Given an instance of a CSP, one wants to know 
whether there exists a solution, that is, an assignment
of the variables which satisfies all the
constraints (e.g., a proper coloring). When it exists, 
the instance is called SAT, and one wants to find 
a solution. Most of the interesting CSPs are 
NP-complete: in the worst case, the number of 
operations needed to decide whether an instance 
is SAT or not is expected to grow exponentially 
with the number of variables. But recent years 
have seen an upsurge of interest in the theory of 
typical-case complexity, where one tries to identify 
random ensembles of CSPs which are hard to solve, 
and the reason for this difficulty. As already 
mentioned, random ensembles of CSPs are also 
of great theoretical and practical importance in 
communication theory, since some of the best 
modern error-correcting codes (the so-called low- 
density parity check codes) are based on such 
constructions.


Satisfiability and Spin Glass Models


The archetypical example of CSP is satisfiability 
(SAT). This is a core problem in computational 
complexity: it is the first one to have been shown 
NP-complete, and since then thousands of problems 
have been shown to be computationally equivalent 
to it. Yet it is not so easy to find difficult instances. 
The main ensemble which has been used for this 
goal is the random K-SAT ensemble (for $K > 2$,
K-SAT is NP-complete).

The SAT problem is defined as follows. Given a
vector of {0,1} Boolean variables $x = (x_i)_{i \in I}$, where
$I = \{1, \ldots, n\}$, consider a SAT formula defined by

$$F(x) = \bigwedge_{a \in A} C_a(x)$$

where $A$ is an arbitrary finite set (disjoint with $I$)
labeling the clauses $C_a$; $C_a(x) = \bigvee_{i \in I(a)} J_{a,i}(x_i)$; any
literal $J_{a,i}(x_i)$ is either $x_i$ or $\neg x_i$ (“not” $x_i$); and
finally, $I(a) \subset I$ for every $a \in A$. Similarly to $I(a)$, we
can define the set $A(i) \subset A$ as $A(i) = \{a : i \in I(a)\}$, that
is, the set of clauses containing variable $x_i$ or its
negation.

Given a formula $F$, the problem of finding a
variable assignment $s$ such that $F(s) = 1$, if it exists,
can also be written as a spin glass problem
as follows: if we consider a set of n Ising spins,
$\sigma_i \in \{\pm 1\}$ in place of the Boolean variables
($\sigma_i = -1, 1 \leftrightarrow x_i = 0, 1$) we may write the energy
function associated to each clause as follows:

$$E_a = \prod_{r=1}^{K} \frac{1 + J_{a,i_r}\,\sigma_{i_r}}{2}$$

where $J_{a,i} = -1$ (resp. $J_{a,i} = 1$) if $x_i$ (resp. $\neg x_i$) appears
in clause $a$, so that $E_a = 1$ if the clause is violated and
$E_a = 0$ otherwise. The total energy of a configuration
$E = \sum_a E_a$ is nothing but a K-spin spin glass
model.
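
The mapping from clauses to spin energies can be checked directly. The sketch below assumes the clause energy is the product of factors $(1 + J_{a,i}\sigma_i)/2$, so that it equals 1 for a violated clause and 0 otherwise; the encoding of literals as signed integers (+i for $x_i$, -i for its negation) and all function names are conventions chosen here for illustration.

```python
def clause_energy(clause, sigma):
    """E_a = prod_r (1 + J_{a,i_r} * sigma_{i_r}) / 2.

    clause: list of nonzero ints, +i for x_i and -i for (not x_i), i >= 1.
    sigma: dict i -> spin in {-1, +1}, where sigma = +1 encodes x_i = 1.
    J = -1 if x_i appears unnegated, J = +1 if it appears negated.
    """
    e = 1.0
    for lit in clause:
        J = -1 if lit > 0 else 1
        e *= (1 + J * sigma[abs(lit)]) / 2
    return e


def total_energy(clauses, sigma):
    """E = sum_a E_a, the K-spin spin glass energy of the configuration."""
    return sum(clause_energy(c, sigma) for c in clauses)


def count_violated(clauses, x):
    """Direct Boolean count of violated clauses; x: dict i -> 0 or 1."""
    return sum(1 for c in clauses
               if not any((x[abs(l)] == 1) == (l > 0) for l in c))
```

With $\sigma_i = 2x_i - 1$, `total_energy` and `count_violated` agree on every assignment, so zero-energy configurations are exactly the satisfying assignments.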

Random K-SAT is a version of SAT in which each 
clause is taken to involve exactly K distinct 
variables, randomly chosen and negated with uni- 
form distribution. Its energy function corresponds to 
a spin glass system over a finite connectivity 
(diluted) random graph. 

In recent years random K-SAT has attracted much 
interest in computer science and in statistical 
physics. The interesting limit is the thermodynamic
limit $n \to \infty$, $m = |A| \to \infty$ at fixed clause density
$\alpha = m/n$.

Its most striking feature is certainly its sharp
threshold. It is strongly believed that there exists a
phase transition for this problem: numerical and
heuristic analytical arguments are in support of the
so-called satisfiability threshold conjecture:


Conjecture There exists $\alpha_c(K)$ such that with high
probability:


• if $\alpha < \alpha_c(K)$, a random instance is satisfiable;
• if $\alpha > \alpha_c(K)$, a random instance is unsatisfiable.


Although this conjecture remains unproven, the
existence of a nonuniform sharp threshold has been
established by Friedgut (1999). A lot of effort has been
devoted to understanding this phase transition. This is
interesting from both the physics and the computer science
points of view, because the random instances with $\alpha$
close to $\alpha_c$ are the hardest to solve. There exist
rigorous results that give bounds for the threshold
$\alpha_c(K)$: using these bounds, it was shown that $\alpha_c(K)$
scales as $2^K \ln(2)$ when $K \to \infty$.
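
A small numerical experiment conveys the qualitative content of the conjecture. The sketch below draws random K-SAT instances as described above (K distinct variables per clause, each negated with probability 1/2) and checks satisfiability by exhaustive enumeration. It is only a toy illustration for very small n, where the transition is strongly smeared; the function names, default parameters and the seed are assumptions of this example.

```python
import itertools
import random


def random_ksat(n, m, K, rng):
    """m clauses, each on K distinct variables in 1..n, negated with prob. 1/2."""
    return [[v if rng.random() < 0.5 else -v
             for v in rng.sample(range(1, n + 1), K)]
            for _ in range(m)]


def is_satisfiable(n, clauses):
    """Exhaustive check over all 2**n assignments (small n only)."""
    for x in itertools.product((False, True), repeat=n):
        if all(any(x[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False


def sat_fraction(n, alpha, K=3, trials=20, seed=0):
    """Fraction of satisfiable instances at clause density alpha = m/n."""
    rng = random.Random(seed)
    m = round(alpha * n)
    hits = sum(is_satisfiable(n, random_ksat(n, m, K, rng))
               for _ in range(trials))
    return hits / trials
```

Calling `sat_fraction(10, alpha)` for a range of clause densities shows the fraction of satisfiable instances dropping from near 1 to near 0 as $\alpha$ grows; the drop sharpens as n increases, which exhaustive search cannot follow very far.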

On the statistical physics side, the cavity method
(which is the generalization to disordered systems
characterized by ergodicity breaking of the iterative
method used to solve exactly physical models on the
Bethe lattice) is a powerful tool which is claimed to
be able to compute the exact value of the threshold,
giving for instance $\alpha_c(3) \simeq 4.2667\ldots$ It is a
nonrigorous method but the self-consistency of its
results has been checked by a “stability analysis,”
and it has also led to the development of a new
family of algorithms (the so-called “survey
propagation”) which can efficiently solve very large
instances at clause densities which are very close to
the threshold (for technical details see Mézard and
Zecchina (2002) and Braunstein et al. (2005) and
references therein).

The main hypothesis on which the cavity analysis
of random K-SAT relies is the existence, in a region
of clause density $[\alpha_d, \alpha_c]$ close to the threshold, of
an intermediate phase called the “hard-SAT” phase;
see Figure 1. In this phase the set $S$ of solutions
(a subset of the vertices in an n-dimensional
hypercube) is supposed to split into many disconnected
clusters $S = S_1 \cup S_2 \cup \cdots$. If one considers
two solutions X, Y in the same cluster $S_j$, it is
possible to walk from X to Y (staying in S) by
flipping at each step a finite number of variables. If,
on the other hand, X and Y are in different clusters,
in order to walk from X to Y (staying in S), at least
one step will involve an extensive number (i.e., $\propto n$)
of flips. This clustered phase is held responsible for
entrapping many local-search algorithms into
nonoptimal metastable states. This phenomenon is not
exclusive to random K-SAT. It is also predicted to
appear in many other hard SAT and optimization
problems such as “coloring,” and corresponds to the
so-called “one-step replica symmetry breaking”
(1RSB) phase in the language of statistical physics.
It is also a crucial limiting feature for decoding
algorithms in some error correcting codes.

Figure 1 A pictorial representation of the clustering transition
in random K-SAT.

The only CSP for which the existence of the
clustering phase has been established rigorously is
the polynomial problem of solving random linear
equations in GF(2) (Motwani and Raghavan 2000). For
random K-SAT, rigorous probabilistic bounds can
be used to prove the existence of the clustering
phenomenon, for large enough K, in some region of
$\alpha$ included in the interval $[\alpha_d(K), \alpha_c(K)]$ predicted by
the statistical physics analysis.

In the analysis of CSPs like K-SAT, two main
questions are in order. The first is of algorithmic 
nature and asks for an algorithm which decides 
whether for a given CSP instance all the constraints 
can be simultaneously satisfied or not. The second 
question is more theoretical and deals with large 
random instances, for which one wants to know the 
structure of the solution space and predict the 
typical behavior of classes of algorithms. 


Message-Passing Algorithms from Statistical Physics


The algorithmic contributions of statistical 
mechanics to combinatorial optimization are numer- 
ous and important (a representative example being 
the celebrated “simulated annealing algorithm”).
For the sake of brevity, here we limit the discussion 
to the so-called “message-passing algorithms” which 
are also of great interest in coding theory. 

The statistical analysis of the cavity equations
makes it possible to study the average properties of
ensembles of problems and it is totally equivalent to the replica
method in which the average over the ensemble is 
the first step in any calculation. The survey 
propagation (SP) equations are a formulation of 
the cavity equations which is valid for each specific 
instance and is able to provide information about 
the statistical behavior of the individual variables in 
the stable and metastable states of a given energy 
density (i.e., given fraction of violated constraints). 


The single-sample SP equations are nicely described
in terms of the factor graph representation used in
information theory to characterize error-correcting
codes. In the factor graph, the N variables i, j, k, ... are
represented by circular “variable nodes,” whereas the
M clauses a, b, c, ... are represented by square
“function nodes.” For random K-SAT, the function nodes
have connectivity K, while the variable nodes have a
Poisson-distributed connectivity with average $K\alpha$.
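
The factor graph of a formula is just the bipartite incidence structure between clauses and variables. A minimal sketch, assuming clauses are encoded as lists of signed integers (+i for $x_i$, -i for its negation) and using illustrative function names, is:

```python
from collections import defaultdict


def factor_graph(clauses):
    """Return (I, A) for a formula given as lists of signed literals.

    I[a] lists the variables attached to function node (clause) a;
    A[i] lists the function nodes attached to variable node i.
    """
    I = {a: [abs(l) for l in c] for a, c in enumerate(clauses)}
    A = defaultdict(list)
    for a, variables in I.items():
        for i in variables:
            A[i].append(a)
    return I, dict(A)


def mean_variable_degree(n, clauses):
    """Average connectivity of the n variable nodes."""
    _, A = factor_graph(clauses)
    return sum(len(A.get(i, [])) for i in range(1, n + 1)) / n
```

Since every clause contributes exactly K edges, the mean variable degree is $Km/n = K\alpha$ for any K-SAT formula; the Poisson shape of the degree distribution is a property of the random ensemble in the large-n limit.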

The iterative SP equations are examples of message- 
passing procedures. In message-passing algorithms 
such as the so-called “belief propagation (BP)
algorithm” used in error-correcting codes and
statistical inference problems, the unknowns which 
are self-consistently evaluated by iteration are the 
marginals over the solution space of the variables 
characterizing the combinatorial problem (the prob- 
ability space is the set of all solutions sampled with 
uniform measure). According to the physical inter- 
pretation, the quantities that are evaluated by SP are 
the probability distributions of local fields over the set 
of clusters. That is, while BP performs a “white” 
average over solutions, SP takes care of cluster-to- 
cluster fluctuations, telling us which is the probability 
of picking up a cluster at random and finding a given 
variable biased in a certain direction (or unfrozen if it 
is paramagnetic in the cluster). SP computes quantities 
which are probabilities over different pure states: the 
order parameter which is evaluated as fixed point of 
the SP equations is a probability measure in a space of 
functions, or for finite n, the full list of probability 
densities describing the cluster-to-cluster fluctuations 
of the variables. 

In both SP and BP one assumes knowledge of the 
marginals of all variables in the temporary absence 
of one of them and then writes the marginal 
probability induced on this “cavity” variable in 
absence of another third variable interacting with it 
(i.e., the so-called Bethe lattice approximation for 
the problem). These relations define a closed set of
equations for such cavity marginals that can be
solved iteratively (this procedure is known as the
message-passing technique). The equations become exact if
the cavity variables acting as inputs are uncorre- 
lated. They are conjectured to be an asymptotically 
exact approximation over random locally tree-like 
structures such as, for instance, the random K-SAT 
factor graph. Both BP and SP can be derived in a 
variational framework. 
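
As an illustration of the message-passing idea, the following is a minimal sketch of belief propagation (not survey propagation) for the uniform measure over the solutions of a SAT formula. Clauses are encoded as lists of signed integers (+i for $x_i$, -i for its negation), each variable is assumed to appear at most once per clause, and the formula is assumed to be satisfiable; the function names and the fixed number of sweeps are choices made for this example. On a tree-like factor graph the cavity marginals are uncorrelated and the result is exact, which the brute-force routine lets one check.

```python
import itertools


def bp_marginals(n, clauses, iters=50):
    """BP estimate of P(x_i = 1) under the uniform measure on solutions."""
    edges = [(a, abs(l)) for a, c in enumerate(clauses) for l in c]
    # v2c[(a, i)]: cavity probability that x_i = 1 when clause a is removed.
    v2c = {e: 0.5 for e in edges}
    # c2v[(a, i)]: weights (for x_i = 0, for x_i = 1) sent by clause a to i.
    c2v = {e: (1.0, 1.0) for e in edges}
    for _ in range(iters):
        for a, c in enumerate(clauses):
            for l in c:
                i = abs(l)
                # Probability that all the other literals of clause a are false.
                p_others_false = 1.0
                for l2 in c:
                    j = abs(l2)
                    if j != i:
                        p1 = v2c[(a, j)]
                        p_others_false *= (1.0 - p1) if l2 > 0 else p1
                w_bad = 1.0 - p_others_false  # weight when literal l is false
                c2v[(a, i)] = (w_bad, 1.0) if l > 0 else (1.0, w_bad)
        for (a, i) in edges:
            w0 = w1 = 1.0
            for (b, j) in edges:
                if j == i and b != a:
                    w0 *= c2v[(b, j)][0]
                    w1 *= c2v[(b, j)][1]
            v2c[(a, i)] = w1 / (w0 + w1)
    marginals = {}
    for i in range(1, n + 1):
        w0 = w1 = 1.0
        for (b, j) in edges:
            if j == i:
                w0 *= c2v[(b, j)][0]
                w1 *= c2v[(b, j)][1]
        marginals[i] = w1 / (w0 + w1)
    return marginals


def exact_marginals(n, clauses):
    """Exact P(x_i = 1) by enumerating all solutions (small n only)."""
    counts = {i: 0 for i in range(1, n + 1)}
    total = 0
    for x in itertools.product((0, 1), repeat=n):
        if all(any((x[abs(l) - 1] == 1) == (l > 0) for l in c) for c in clauses):
            total += 1
            for i in range(1, n + 1):
                counts[i] += x[i - 1]
    return {i: counts[i] / total for i in counts}
```

For the tree-like formula $(x_1 \vee x_2 \vee x_3) \wedge (\neg x_3 \vee x_4 \vee x_5)$, both routines give $P(x_1 = 1) = 7/12$ and $P(x_3 = 1) = 1/2$. On graphs with short loops BP is only an approximation.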


Complexity of Counting Problems 


In order to describe the nature of computational 
complexity of counting in physical models, it is 
enough to consider the classical Ising problem. The
computation of the Ising partition function or, more
generally, of the weighted matching polynomial, is
the root problem of lattice statistics.

For planar graphs such as two-dimensional
regular lattices, counting problems can often be
solved by a variety of different methods, for
example, transfer matrices and Pfaffians, which
require a number of operations that is polynomial
in the number of vertices.
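
The transfer-matrix idea is easiest to see in the simplest possible setting. The sketch below treats a ring of N Ising spins, a one-dimensional simplification chosen for this example rather than the two-dimensional case discussed in the text: the partition function is the trace of the N-th power of a 2 x 2 matrix, computed in a number of operations linear in N, and it is compared with the exponential-cost sum over configurations. Function names and the coupling convention $H = -J\sum_i \sigma_i\sigma_{i+1}$ are assumptions of this illustration.

```python
import itertools
import math


def ring_Z_transfer(N, beta, J=1.0):
    """Z for a ring of N spins as Tr(T**N), with T[s][s'] = exp(beta*J*s*s')."""
    a, b = math.exp(beta * J), math.exp(-beta * J)
    T = [[a, b], [b, a]]
    P = [[1.0, 0.0], [0.0, 1.0]]  # identity matrix
    for _ in range(N):
        P = [[sum(P[r][k] * T[k][c] for k in range(2)) for c in range(2)]
             for r in range(2)]
    return P[0][0] + P[1][1]


def ring_Z_brute(N, beta, J=1.0):
    """Same quantity by summing over all 2**N configurations."""
    Z = 0.0
    for s in itertools.product((-1, 1), repeat=N):
        H = -J * sum(s[i] * s[(i + 1) % N] for i in range(N))
        Z += math.exp(-beta * H)
    return Z
```

Both functions agree with the closed form $(2\cosh\beta J)^N + (2\sinh\beta J)^N$. For a two-dimensional strip the same construction applies with a transfer matrix whose size grows exponentially with the strip width.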

The complexity of the counting problems changes 
if one considers nonplanar graphs, that is, graphs 
with a nontrivial topological genus. In discrete 
mathematics, such problems are classified as 
#P-complete, meaning that the existence of an 
exact polynomial algorithm for the evaluation of the 
generating functions would imply the polynomial 
solvability of many known counting combinatorial 
problems, the most famous one being the evaluation of 
the permanent of 0-1 matrices. In statistical mechanics 
and mathematical chemistry, the interest in nonplanar 
lattices is obviously related to their D > 2 character: 
the three-dimensional cubic lattice is nothing but a 
nonplanar graph of topological genus g= 1 + N/4, 
where N is the number of sites. 

The planar two-dimensional Ising model was solved 
in 1944 by Onsager using the algebraic transfer matrix 
method. Subsequently, alternative exact solutions have
been proposed which resorted to simple combinatorial 
and geometrical reasoning. As is well known, the 
underlying idea of the combinatorial methods consists 
in recasting the sum over spin configurations of the 
Boltzmann weights as a sum over closed curves (loops) 
weighted by the activity of their bonds. Double 
counting is avoided by a proper cancellation mechan- 
ism which takes care of the different intrinsic 
topologies of loops which give rise to the same 
contribution in the partition function. Such an 
approach has been developed first by Kac and Ward 
(1952) and provides a direct way of taking the field 
theoretic continuum limit. In D > 2, the general- 
ization of the above method encounters enormous 
difficulties due to the variety of intrinsic topologies of 
surfaces immersed in D > 2 lattices. 

Another combinatorial method proposed in the 
1960s by Kasteleyn is the so-called Pfaffian method. 
It consists in writing the weighted sum over loops as 
a dimer covering or perfect matching generating
function. Once the relationship between loop count- 
ing and dimer coverings (or perfect matchings) over 
a suitably decorated and properly oriented lattice is 
established, the Pfaffian method turns out to be a 
simple technique for the derivation of exact solu- 
tions or for the definition of polynomial algorithms 
over planar lattices which are applicable also to the 
two-dimensional Ising spin glass. 


The generalization of the Pfaffian construction to 
the nonplanar case must deal with the ambiguity of 
orienting the homology cycles of the graph. Such a 
problem can be formally solved in full generality for 
any orientable lattice and leads to an expression of 
the Ising partition function or the dimer coverings 
generating function given as a sum over all possible 
inequivalent orientations of the lattice (or its embedding
surface): for a graph of genus $g$, the homology
basis is composed of $2g$ cycles and, therefore, there
are $2^{2g}$ inequivalent orientations. It is only for graphs
of logarithmic genus that the generalized Pfaffian 
formalism provides a polynomial algorithm. 

Counting perfect matchings can be thought of as 
the problem of evaluating the permanent of 0-1 
matrices over properly constructed bipartite graphs, 
which is among the oldest and most famous 
#P-complete problems. 
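
The link between perfect matchings and the permanent can be made explicit for a bipartite graph with biadjacency matrix M, where M[i][j] = 1 if left vertex i is joined to right vertex j. The sketch below evaluates the permanent with Ryser's inclusion-exclusion formula, a standard exponential-time method that is not discussed in the text, and compares it with a direct count over permutations; both function names are illustrative.

```python
import itertools


def permanent(M):
    """Permanent of a square matrix via Ryser's formula (exponential cost)."""
    n = len(M)
    total = 0
    for r in range(1, n + 1):
        for cols in itertools.combinations(range(n), r):
            prod = 1
            for row in M:
                prod *= sum(row[c] for c in cols)
            total += (-1) ** (n - r) * prod
    return total


def count_perfect_matchings(M):
    """Count perfect matchings of the bipartite graph by brute force."""
    n = len(M)
    return sum(all(M[i][p[i]] for i in range(n))
               for p in itertools.permutations(range(n)))
```

For the 3 x 3 all-ones matrix (the complete bipartite graph on 3 + 3 vertices) both functions return 3! = 6. No polynomial-time algorithm is known for general 0-1 matrices, in line with the #P-completeness of the problem.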

The Pfaffian formalism when applied to the perma- 
nent problem leads to a simple general result, that is, it 
provides a general formula for writing the permanent 
of a matrix in terms of a number of determinants which 
is exponential in the genus of the underlying graph. 


See also: Combinatorics: Overview; Determinantal 
Random Fields; Dimer Problems; Phase Transitions in 
Continuous Systems; Spin Glasses; Two-Dimensional 
Ising Model. 


Further Reading 


Achlioptas D, Naor A, and Peres Y (2005) Rigorous location of
phase transitions in hard optimization problems. Nature 435:
759-764.

Ajtai M (1996) Generating hard instances of lattice problems.
Electronic Colloquium on Computational Complexity
(ECCC) 7: 3.

Braunstein A, Mézard M, and Zecchina R (2005) Survey
Propagation: an Algorithm for Satisfiability. Random
Structures and Algorithms 27: 201-226.

Cocco S and Monasson R (2004) Heuristic average-case analysis 
of the backtrack resolution of random  3-satisfiability 
instances. Theoretical Computer Science A 320: 345. 

Distler J (1992) A note on the 3D Ising model as a string theory. 
Nuclear Physics B 388: 648. 

Dubois O, Monasson R, Selman B, and Zecchina R (eds.) (2001) 
NP-hardness and Phase transitions (Special Issue), Theoretical 
Computer Science, vol. 265 (1-2). Elsevier. 

Friedgut E (1999) Sharp threshold of graph properties, and the
KSat problem. Journal of the American Mathematical Society 12:
1017-1054.

Jerrum M and Sinclair A (1989) Approximating the permanent. 
SIAM Journal on Computing 18: 1149. 

Lovász L and Plummer MD (1986) Matching Theory.
North-Holland Mathematics Studies 121, Annals of Discrete
Mathematics (29). New York: North-Holland.

Mézard M, Mora T, and Zecchina R (2005) Clustering of
solutions in the random satisfiability problem. Physical Review
Letters 94: 197205.




Mézard M and Parisi G (2001) The Bethe lattice spin glass 
revisited. European Physical Journal B 20: 217. 

Mézard M, Parisi G, and Virasoro MA (1987) Spin Glass
Theory and Beyond. Singapore: World Scientific.

Mézard M, Parisi G, and Zecchina R (2002) Analytic and 
algorithmic solution of random satisfiability problems. Science 
297: 812-815. 

Mézard M and Zecchina R (2002) Random K-satisfiability: from 
an analytic solution to a new efficient algorithm. Physical 
Review E 66: 056126. 

Monasson R, Zecchina R, Kirkpatrick S, Selman B, and 
Troyansky L (1999) Determining computational complexity 
from characteristic “phase transitions”. Nature 400: 133.


Motwani R and Raghavan P (2000) Randomized Algorithms. 
Cambridge: Cambridge University Press. 

Nishimori H (2001) Statistical Physics of Spin Glasses and
Information Processing. Oxford: Oxford University Press.

Papadimitriou CH (1994) Computational Complexity.
Addison-Wesley.

Regge T and Zecchina R (2000) Combinatorial and topological 
approach to the 3D Ising model. Journal of Physics A: 
Mathematical and General 33: 741. 

Richardson T and Urbanke R (2001) An introduction to the analysis 
of iterative coding systems. In: Marcus B and Rosenthal J (eds.) 
Codes, Systems, and Graphical Models. New York: Springer. 


Statistical Mechanics of Interfaces


S Miracle-Solé, Centre de Physique Théorique,
CNRS, Marseille, France


© 2006 Elsevier Ltd. All rights reserved.


Introduction 


When a fluid is in contact with another fluid, or 
with a gas, a portion of the total free energy of the 
system is proportional to the area of the surface of 
contact, and to a coefficient, the surface tension, 
which is specific for each pair of substances. 
Equilibrium will accordingly be obtained when the 
free energy of the surfaces in contact is a minimum. 

Suppose that we have a drop of some fluid, b, 
over a flat substrate, w, while both are exposed to 
air, a. We have then three different surfaces of 
contact, and the total free energy of the system 
consists of three parts, associated to these three 
surfaces. A drop of fluid b will exist provided its 
own two surface tensions exceed the surface tension 
between the substrate w and the air, that is, 
provided that 


$$\tau_{wb} + \tau_{ba} > \tau_{wa}$$


If equality is attained, then a film of fluid b is 
formed, a situation which is known as perfect, or 
complete wetting (see Figure 1). 
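
The wetting criterion can be turned into a small calculation. The sketch below assumes Young's law, $\tau_{wa} = \tau_{wb} + \tau_{ba}\cos\theta$, for the contact angle $\theta$ of the drop; this relation is standard but is not stated in the text, and the function name and the labels returned are illustrative.

```python
import math


def wetting_state(tau_wb, tau_ba, tau_wa):
    """Classify the wetting regime and return (label, contact angle in degrees).

    Complete wetting (a film, angle 0) when tau_wb + tau_ba <= tau_wa;
    otherwise the contact angle follows from Young's law (an assumption
    of this sketch): cos(theta) = (tau_wa - tau_wb) / tau_ba.
    """
    if tau_wb + tau_ba <= tau_wa:
        return "complete", 0.0
    cos_theta = (tau_wa - tau_wb) / tau_ba
    if cos_theta <= -1.0:
        return "dry", 180.0  # the drop does not adhere to the substrate
    return "partial", math.degrees(math.acos(cos_theta))
```

For example, equal surface tensions $\tau_{wb} = \tau_{ba} = \tau_{wa}$ satisfy the strict inequality and give a contact angle of 90 degrees, while $\tau_{wa} = \tau_{wb} + \tau_{ba}$ gives a film.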

When one of the substances involved is aniso- 
tropic, such as a crystal, the contribution to the total 
free energy of each element of area depends on its 
orientation. The minimum surface free energy for a 
given volume then determines the ideal form of the 
crystal in equilibrium. 

It is only in recent times that equilibrium crystals
have been produced in the laboratory, first, in
negative crystals (vapor bubbles) of organic substances.
Most crystals grow under nonequilibrium
conditions and it is a subsequent relaxation of the
macroscopic crystal that restores the equilibrium.

Figure 1 Partial and complete wetting.

An interesting phenomenon that can be observed 
on these crystals is the roughening transition, 
characterized by the disappearance of the facets of 
a given orientation, when the temperature attains a 
certain particular value. The best observations have 
been made on helium crystals, in equilibrium with 
superfluid helium, since the transport of matter and 
heat is then extremely fast. Crystals grow to sizes of 
1-5 mm and relaxation times vary from milliseconds 
to minutes. Roughening transitions for three differ- 
ent types of facets have been observed (see, e.g., 
Wolf et al. (1983)). 

These are some classical examples among a 
variety of interesting phenomena connected with 
the behavior of the interface between two phases in 
a physical system. The study of the nature and 
properties of the interfaces, at least for some simple 
systems in statistical mechanics, is also an interesting 
subject of mathematical physics. Some aspects of 
this study will be discussed in the present article. 

We assume that the interatomic forces can be 
modeled by a lattice gas, and consider, as a simple 
example, the ferromagnetic Ising model. In a typical 
two-phase equilibrium state, there is a dense 
component, which can be interpreted as a solid or
liquid phase, and a dilute phase, which can be 
interpreted as the vapor phase. Considering certain 
particular cases of such situations, we first introduce 
a precise definition of the surface tension and then 
proceed to the mathematical analysis of some
preliminary properties of the corresponding inter- 
faces. The next topic concerns the wetting properties 
of the system, and the final section is devoted to the 
associated equilibrium crystal. 


Pure Phases and Surface Tension 


The Ising model is defined on the cubic lattice $\mathcal{L} = \mathbb{Z}^3$,
with configuration space $\Omega = \{-1, 1\}^{\mathcal{L}}$. If $\sigma \in \Omega$, the
value $\sigma(i) = -1$ or $1$ is the spin at the site
$i = (i_1, i_2, i_3) \in \mathcal{L}$, and corresponds to an empty or an
occupied site in the lattice gas version of the model.
The system is first considered in a finite box $\Lambda \subset \mathcal{L}$,
with fixed values of the spins outside.

In order to simplify the exposition, we shall
mainly consider the three-dimensional Ising model,
though some of the results to be discussed hold in
any dimension $d \ge 2$. We shall also, sometimes,
refer to the two-dimensional model, it being understood
that the definitions have been adapted in the
obvious way. We assume that the box $\Lambda$ is a
parallelepiped, centered at the origin of $\mathcal{L}$, of sides
$L_1$, $L_2$, $L_3$, parallel to the axes.

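A configuration of spins on $\Lambda$, $(\sigma(i), i \in \Lambda)$, denoted
$\sigma_\Lambda$, has an energy defined by the Hamiltonian

$$H_\Lambda(\sigma_\Lambda \mid \bar\sigma) = -J \sum_{\langle i,j \rangle \cap \Lambda \neq \emptyset} \sigma(i)\sigma(j) \qquad [1]$$

where $J$ is a positive constant (ferromagnetic or
attractive interaction). The sum runs over all
nearest-neighbor pairs $\langle i,j \rangle \subset \mathcal{L}$, such that at least
one of the sites belongs to $\Lambda$, and one takes
$\sigma(i) = \bar\sigma(i)$ when $i \notin \Lambda$, the configuration $\bar\sigma \in \Omega$
being the given boundary condition. The probability
of the configuration $\sigma_\Lambda$, at the inverse temperature
$\beta = 1/kT$, is given by the Gibbs measure

$$\mu_\Lambda(\sigma_\Lambda \mid \bar\sigma) = Z^{\bar\sigma}(\Lambda)^{-1} \exp\bigl(-\beta H_\Lambda(\sigma_\Lambda \mid \bar\sigma)\bigr) \qquad [2]$$

where $Z^{\bar\sigma}(\Lambda)$ is the partition function

$$Z^{\bar\sigma}(\Lambda) = \sum_{\sigma_\Lambda} \exp\bigl(-\beta H_\Lambda(\sigma_\Lambda \mid \bar\sigma)\bigr) \qquad [3]$$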
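As an illustration of definitions [1]-[3], the following Python sketch evaluates the Hamiltonian, the partition function and the Gibbs probabilities by brute-force enumeration. It is only a sketch under stated assumptions: a very small two-dimensional box is used to keep the enumeration short (the definitions being adapted in the obvious way), and the function names, the box size and the values of $\beta$ and $J$ are illustrative choices that do not come from the text.

```python
import itertools
import math

NEIGHBOR_STEPS = ((1, 0), (0, 1), (-1, 0), (0, -1))

def box_sites(l1, l2):
    """Sites of an l1 x l2 box (hypothetical helper, 2D for tractability)."""
    return [(x, y) for x in range(l1) for y in range(l2)]

def hamiltonian(spins, boundary, J=1.0):
    """Eq. [1]: -J times the sum of sigma(i)*sigma(j) over nearest-neighbor
    pairs with at least one site in the box; outside sites take boundary(site)."""
    total = 0
    for (x, y), s in spins.items():
        for dx, dy in NEIGHBOR_STEPS:
            nb = (x + dx, y + dy)
            if nb in spins:
                if dx + dy > 0:  # count each inner pair only once
                    total += s * spins[nb]
            else:
                total += s * boundary(nb)
    return -J * total

def partition_function(sites, boundary, beta, J=1.0):
    """Eq. [3]: sum of Boltzmann weights over all configurations in the box."""
    z = 0.0
    for values in itertools.product((-1, 1), repeat=len(sites)):
        spins = dict(zip(sites, values))
        z += math.exp(-beta * hamiltonian(spins, boundary, J))
    return z

def gibbs_probability(spins, boundary, beta, J=1.0):
    """Eq. [2]: probability of one configuration under the Gibbs measure."""
    z = partition_function(list(spins), boundary, beta, J)
    return math.exp(-beta * hamiltonian(spins, boundary, J)) / z

def plus(site):
    """The (+) boundary condition: every outside spin equals 1."""
    return 1

def minus(site):
    """The (-) boundary condition: every outside spin equals -1."""
    return -1

sites = box_sites(3, 3)
z_plus = partition_function(sites, plus, beta=0.5)
z_minus = partition_function(sites, minus, beta=0.5)
```

By the spin-flip symmetry of [1], the values `z_plus` and `z_minus` coincide, and at $\beta = 0$ the partition function reduces to the number of configurations, $2^9$ for the box used here.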


Local properties at equilibrium can be described by
the correlation functions between the spins on finite
sets of sites $A \subset \Lambda$,

$$\mu_\Lambda(\sigma(A) \mid \bar\sigma) = \sum_{\sigma_\Lambda} \mu_\Lambda(\sigma_\Lambda \mid \bar\sigma) \prod_{i \in A} \sigma(i) \qquad [4]$$

The measures [2] determine (by the Dobrushin-Lanford-Ruelle
equations) the set of Gibbs states of
the infinite system, as measures on the set $\Omega$ of all
configurations. If a Gibbs state happens to be equal
to $\lim \mu_\Lambda(\cdot \mid \bar\sigma)$, when $L_1, L_2, L_3 \to \infty$, under a fixed
boundary condition $\bar\sigma$, we shall call it the Gibbs
state associated to the boundary condition $\bar\sigma$. One
also says that this state exists in the thermodynamic
limit. Then, equivalently, the correlation functions
[4] converge to the corresponding expectation values
in this state.

This model presents, at low temperatures (i.e., for
$\beta > \beta_c$, where $\beta_c$ is the critical inverse temperature),
two different thermodynamic pure phases, a dense
and a dilute phase in the lattice gas language (called
here the positive and the negative phase). This
means two extremal translation-invariant Gibbs
states, $\mu^{+}$ and $\mu^{-}$, obtained as the Gibbs states
associated with the boundary conditions $\bar\sigma$, respectively
equal to the ground configurations $\bar\sigma(i) = 1$
and $\bar\sigma(i) = -1$, for all $i \in \mathcal{L}$. The spontaneous
magnetization

$$m^{*}(\beta) = \mu^{+}(\sigma(i)) = -\mu^{-}(\sigma(i)) \qquad [5]$$

is then strictly positive. On the other hand, if $\beta \le \beta_c$,
then the Gibbs state is unique and $m^{*} = 0$.

Each configuration inside A can be described in a 
geometric way by specifying the set of Peierls 
contours which indicate the boundaries between 
the regions of spin 1 and the regions of spin —1. 
Unit-square faces are placed midway between the
pairs of nearest-neighbor sites $i$ and $j$, perpendicularly
to these bonds, whenever $\sigma(i)\sigma(j) = -1$. The
connected components of this set of faces are the 
Peierls contours. Under the boundary conditions (+) 
and (—), the contours form a set of closed surfaces. 
They describe the defects of the considered config- 
uration with respect to the ground states of the 
system (the constant configurations 1 and —1), and 
are a basic tool for the investigation of the model at 
low temperatures. 
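The contour description can be checked numerically on a small example. The sketch below, written for a two-dimensional box with a constant boundary condition, collects the nearest-neighbor bonds whose two spins disagree (each such bond is crossed by exactly one contour face) and compares the energy with that of the ground configuration: the excess equals $2J$ times the number of contour faces. All names and the chosen configuration are illustrative assumptions.

```python
STEPS = ((1, 0), (0, 1), (-1, 0), (0, -1))

def bonds(sites):
    """Nearest-neighbor pairs with at least one site in the box."""
    box = set(sites)
    return {frozenset((s, (s[0] + dx, s[1] + dy))) for s in box for dx, dy in STEPS}

def spin(site, spins, boundary):
    """Spin inside the box, or the constant boundary value outside."""
    return spins.get(site, boundary)

def energy(spins, boundary, J=1.0):
    """Hamiltonian of the box with a constant boundary condition."""
    total = 0
    for pair in bonds(spins):
        i, j = tuple(pair)
        total += spin(i, spins, boundary) * spin(j, spins, boundary)
    return -J * total

def contour_faces(spins, boundary):
    """Bonds with sigma(i)*sigma(j) = -1; each carries one contour face."""
    faces = set()
    for pair in bonds(spins):
        i, j = tuple(pair)
        if spin(i, spins, boundary) * spin(j, spins, boundary) == -1:
            faces.add(pair)
    return faces

# A 4 x 4 box with (+) boundary condition and a domino of two flipped spins.
spins = {(x, y): 1 for x in range(4) for y in range(4)}
spins[(1, 1)] = spins[(2, 1)] = -1
ground = {site: 1 for site in spins}

faces = contour_faces(spins, boundary=1)
excess = energy(spins, 1) - energy(ground, 1)
```

Here the single closed contour surrounding the domino has six faces, and `excess` equals $2J \times 6$ with $J = 1$.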

In order to study the interface between the two
pure phases, one needs to construct a state describing
the coexistence of these phases. This can be done
by means of a new boundary condition. Let
$n = (n_1, n_2, n_3)$ be a unit vector in $\mathbb{R}^3$, such that
$n_3 > 0$, and introduce the mixed boundary condition
$(\pm, n)$, for which

$$\bar\sigma(i) = \begin{cases} 1 & \text{if } i \cdot n \ge 0 \\ -1 & \text{if } i \cdot n < 0 \end{cases} \qquad [6]$$

This boundary condition forces the system to
produce a defect going transversally through the
box $\Lambda$, a large Peierls contour that can be
interpreted as the microscopic interface (also called
a domain wall). The other defects that appear above
and below the interface can be described by closed
contours inside the pure phases.

The free energy per unit area due to the presence
of the interface is the surface tension. It is defined by

$$\tau(n) = \lim_{L_1, L_2 \to \infty} \lim_{L_3 \to \infty} -\frac{n_3}{\beta L_1 L_2} \ln \frac{Z^{\pm,n}(\Lambda)}{Z^{+}(\Lambda)} \qquad [7]$$

In this expression the volume contributions proportional
to the free energy of the coexisting phases, as
well as the boundary effects, cancel, and only the
contributions to the free energy due to the interface
are left. The existence of such a quantity indicates
that the macroscopic interface, separating the
regions occupied by the pure phases in a large
volume $\Lambda$, has a microscopic thickness and can
therefore be regarded as a surface in a thermodynamic
approach.


Theorem 1 The interfacial free energy per unit
area, $\tau(n)$, exists, is bounded, and its extension by
positive homogeneity, $f(x) = |x|\,\tau(x/|x|)$, is a convex
function on $\mathbb{R}^3$. Moreover, $\tau(n)$ is strictly positive
for $\beta > \beta_c$, and vanishes if $\beta \le \beta_c$.


The existence of $\tau(n)$ and also the last statement
were proved by Lebowitz and Pfister (1981), in the
particular case $n = (0, 0, 1)$, with the help of correlation
inequalities. A complete proof of the theorem
was given later with similar arguments. The convexity
of $f$ is equivalent to the fact that the surface
tension $\tau$ satisfies a thermodynamic stability condition
known as the pyramidal inequality (see
Messager et al. (1992)).


Gibbs States and Interfaces 


In this section we consider the $(\pm, n_0)$ boundary
condition, also simply denoted $(\pm)$, associated to the
vertical direction $n_0 = (0, 0, 1)$,

$$\bar\sigma(i) = 1 \ \text{if } i_3 \ge 0, \qquad \bar\sigma(i) = -1 \ \text{if } i_3 < 0 \qquad [8]$$

The corresponding surface tension is $\tau^{\pm} = \tau(n_0)$. We
shall first recall some classical results which concern
the Gibbs states and interfaces at low temperatures.

According to the geometrical description of the
configurations introduced in the last section, we
observe that

$$\frac{Z^{\pm}(\Lambda)}{Z^{+}(\Lambda)} = \sum_{\lambda} \exp\bigl(-2\beta J |\lambda| - U_\Lambda(\lambda)\bigr) \qquad [9]$$

where the sum runs over all microscopic interfaces $\lambda$
compatible with the boundary condition and $|\lambda|$ is
the number of faces of $\lambda$ (inside $\Lambda$). The term $U_\Lambda(\lambda)$
equals $-\ln\bigl(Z^{+}(\Lambda, \lambda)/Z^{+}(\Lambda)\bigr)$, the sum in the partition
function $Z^{+}(\Lambda, \lambda)$ being extended to all configurations
whose associated contours do not intersect $\lambda$.
Each term in sum [9] gives a weight proportional to
the probability of the corresponding microscopic
interface.

At low (positive) temperatures, we expect the
microscopic interface corresponding to this boundary
condition, which at zero temperature coincides
with the plane $i_3 = -1/2$, to be modified by small
deformations. Each microscopic interface $\lambda$ can then
be described by its defects, with respect to the
interface at $\beta = \infty$. To this end, one introduces some
objects, called walls, which form the boundaries
between the horizontal plane portions of the microscopic
interface, also called the ceilings of the
interface.

More precisely, one says that a face of $\lambda$ is a
ceiling face if it is horizontal and such that
the vertical line passing through its center does not
have other intersections with $\lambda$. Otherwise, one
says that it is a wall face. The set of wall faces splits
into maximal connected components. The set of
walls, associated to $\lambda$, is the set of these components,
each component being identified by its
geometric form and its projection on the plane
$i_3 = -1/2$. Every wall $w$, with projection $\pi(w)$,
increases the energy of the interface by a quantity
$2J\|w\|$, where $\|w\| = |w| - |\pi(w)|$, and two walls are
compatible if their projections do not intersect. In
this way, the microscopic interfaces may be interpreted
as a “gas of walls” on the two-dimensional
lattice.

Dobrushin, who developed the above analysis,
also proved the dilute character of this “gas” at low
temperatures. This implies that the microscopic
interface is essentially flat, or rigid. One can understand
this fact by noticing first that the probability
of a wall is less than $\exp(-2\beta J\|w\|)$ and, second,
that in order to create a ceiling in $\lambda$, which is not in
the plane $i_3 = -1/2$, one needs to surround it by a
wall, which has to grow when the ceiling is made to
cover a larger area.

Using correlation inequalities one proves that the
Gibbs state $\mu^{\pm}$, associated to the $(\pm)$ boundary
conditions, always exists, and that it is invariant
under horizontal translations of the lattice, that is,
$\mu^{\pm}(\sigma(A + a)) = \mu^{\pm}(\sigma(A))$ for all $a = (a_1, a_2, 0)$. It is
also an extremal Gibbs state. Let $m^{\pm}(z)$ be the
magnetization $\mu^{\pm}(\sigma(z))$ at the site $z = (0, 0, z)$. The
function $m^{\pm}(z)$ is monotone increasing and, since
condition [8] is preserved by the reflection through the
plane $i_3 = -1/2$ combined with a spin flip, it satisfies
the symmetry property $m^{\pm}(-z) = -m^{\pm}(z - 1)$. Some
consequences of Dobrushin's work are the following
properties.
Theorem 2 If the temperature is low enough, that
is, if $\beta J \ge c_1$, where $c_1$ is a given constant, then

$$m^{\pm}(0) \ \text{is strictly positive} \qquad [10]$$
$$m^{\pm}(z) \to m^{*}, \ \text{when } z \to \infty, \ \text{exponentially fast} \qquad [11]$$


Equation [10] is just another way of saying that
the interface is rigid and that the state $\mu^{\pm}$ is non-translation
invariant (in the vertical direction).
Then, the correlation functions $\mu^{\pm}(\sigma(A))$ describe
the local properties, or local structure, of the
macroscopic interface. In particular, the function
$m^{\pm}(z)$ represents the magnetization profile. Then
statement [11], together with the symmetry property,
tells us that the thickness of this interface is
finite, with respect to the unit lattice spacing.

The statistics of interfaces has been rewritten in 
terms of a gas of walls and this system may further 
be studied by cluster expansion techniques. There is 
an interaction between the walls, coming from the 
term $U_\Lambda(\lambda)$ in eqn [9], but a convenient mathematical
description of this interaction can be obtained
by applying the standard low-temperature cluster 
expansion, in terms of contours, to the regions 
above and below the interface. 

This method was introduced by Gallavotti in his 
study (mentioned below) of the two-dimensional 
Ising model. It has been applied by Bricmont and 
co-workers to examine the interface structure in the 
present case. As a consequence, it follows that
the surface tension, more exactly $\beta\tau^{\pm}(\beta)$, and also
the correlation functions, are analytic functions at
low temperatures. They can be obtained as explicit
convergent series in the variable $\zeta = e^{-2\beta J}$.

The same analysis applied to the two-dimensional
model shows a very different behavior at low
temperatures. In this case, the microscopic interface
$\lambda$ is a polygonal line and the walls belong to the one-dimensional
lattice. One can then increase the size of
a ceiling without modifying the walls attached to it.

Indeed, Gallavotti turned this observation into a
proof that the Gibbs state $\mu^{\pm}$ is now translation
invariant. The line $\lambda$ undergoes large fluctuations of
order $\sqrt{L_1}$, and disappears from any finite region of
the lattice, in the thermodynamic limit. In particular,
we have then $\mu^{\pm} = (1/2)(\mu^{+} + \mu^{-})$, a result that
extends to all boundary conditions $(\pm, n)$.

Using these results Bricmont and co-workers also 
studied the local structure of the interface at low 
temperatures and showed that its intrinsic thickness 
is finite. To study the global fluctuations, one can
compute the magnetization profile by introducing,
before taking the thermodynamic limit, a change of
scale: $\mu^{\pm}_\Lambda(\sigma(zL_1^{\delta}))$, with $\delta = 1/2$ or near to this value.


This is an exact computation that has been done by 
Abraham and Reed. 

Let us come back to the three-dimensional Ising 
model where we know that the interface orthogonal 
to a lattice axis is rigid at low temperatures. 


Question 1 At higher temperatures, but before 
reaching the critical temperature, do the fluctuations 
of this interface become unbounded, in the thermo- 
dynamic limit, so that the corresponding Gibbs state 
is translation invariant? 


One says then that the interface is rough, and it is
believed that, indeed, the interface becomes
rough when the temperature is raised, undergoing
a roughening transition at an inverse temperature
$\beta_R > \beta_c$.

It is known that $\beta_R \le \beta_c^{d=2}$, the critical inverse
temperature of the two-dimensional Ising model,
since van Beijeren proved, using correlation inequalities,
that above this value, the state $\mu^{\pm}$ is not
translation invariant. Recalling that the rigid interface
may be viewed as a two-dimensional system,
the system of walls, a representation that would
become inappropriate for a rough interface, one
might think that the phase transition of the two-dimensional
Ising model is relevant for the roughening
transition, and that $\beta_R$ is somewhere near
$\beta_c^{d=2}$. Indeed, approximate methods, used by Weeks
and co-workers, give some evidence for the existence
of such a $\beta_R$ and suggest a value slightly smaller
than $\beta_c^{d=2}$, as shown in Table 1. To this day,
however, there appears to be no proof of the fact
that $\beta_R > \beta_c$, that is, that the roughening transition
for the three-dimensional Ising model really occurs.

At present one is able to study the roughening 
transition rigorously only for some simplified mod- 
els with a restricted set of admissible microscopic 
interfaces. Moreover, the closed contours, describing
the defects above and below $\lambda$, are neglected, so that
these two regions have the constant configurations 1
or $-1$, and one has $U_\Lambda(\lambda) = 0$ in eqn [9].

Table 1 Some temperature values

d = 3    $\beta_c^{d=3} \approx 0.22$    approximate critical temperature
d = 3    $\beta_R^{d=3} \approx 0.41$    conjectured roughening temperature
d = 2    $\beta_c^{d=2} = 0.44$    exact critical temperature

The best known of these models is the classic SOS
(solid-on-solid) model in which the interfaces $\lambda$ have
the property of being cut only once by all vertical
lines of the lattice. This means that $\lambda$ is the graph of
a function that can equivalently be used to define
the possible configurations of $\lambda$. If $\lambda$ contains the
horizontal face with center $(i_1, i_2, i_3 - 1/2)$, then
the value at $(i_1, i_2)$ of the associated function is
$\phi(i_1, i_2) = i_3$.
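To make the SOS description concrete, the following sketch counts the faces $|\lambda|$ of the interface given by a height function and evaluates the corresponding weight $\exp(-2\beta J|\lambda|)$ of eqn [9] with the interaction term set to zero. The dictionary representation, the function names and the convention that the height is 0 outside the box (so that the interface is pinned at the plane $i_3 = -1/2$ by the boundary condition) are assumptions made for this illustration.

```python
import math

STEPS = ((1, 0), (0, 1), (-1, 0), (0, -1))

def sos_area(heights):
    """Number of faces of the SOS interface that is the graph of `heights`,
    a dict mapping (i1, i2) to an integer, with height 0 outside the box.
    Horizontal faces: one per column. Vertical faces: |difference of heights|
    for every pair of neighboring columns."""
    horizontal = len(heights)
    vertical = 0
    for (x, y), h in heights.items():
        for dx, dy in STEPS:
            nb = (x + dx, y + dy)
            if nb in heights:
                if dx + dy > 0:  # count each inner pair of columns once
                    vertical += abs(h - heights[nb])
            else:
                vertical += abs(h)  # neighbor column outside the box has height 0
    return horizontal + vertical

def sos_weight(heights, beta, J=1.0):
    """Unnormalized weight exp(-2 beta J |lambda|) of an SOS interface."""
    return math.exp(-2.0 * beta * J * sos_area(heights))

flat = {(x, y): 0 for x in range(3) for y in range(3)}
bump = dict(flat)
bump[(1, 1)] = 2  # raise the central column by two lattice units
```

For the $3 \times 3$ example, the flat interface has 9 faces and the bump adds $4 \times 2 = 8$ vertical faces, so the ratio of the two weights is $\exp(-16\beta J)$.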

The proof that the SOS model with the boundary
condition $(\pm)$ has a roughening transition is a highly
nontrivial result due to Fröhlich and Spencer. When
$\beta$ is small enough, the fluctuations of $\lambda$ are of order
$\sqrt{\ln L}$ (in a cubic box of side $L$).

Moreover, other interface models, with additional 
conditions on the allowed microscopic interfaces, 
are exactly solvable. The BCSOS (body-centered 
SOS) model, introduced by van Beijeren, belongs to 
this class. It is, in fact, the first model for which the 
existence of a roughening transition has been 
proved. More recently, the TISOS (triangular
Ising SOS) model, introduced by Blöte and Hilhorst
and further studied by Nienhuis and co-workers, has
also been considered in this context.

The interested reader can find more information 
and references, concerning the subject of this 
section, in the review article by Abraham (1986). 


Wetting Phenomena 


Next we consider the Ising model over a plane 
horizontal substrate (also called a wall) and study 
the difference of surface tensions which governs the 
wetting properties of this system. 

We first describe the approach developed by
Fröhlich and Pfister (1987) and briefly report some
results of their study. We consider the model on the
semi-infinite lattice

$$\mathcal{L}' = \{i \in \mathbb{Z}^3 : i_3 \ge 0\} \qquad [12]$$

A magnetic field, $K \ge 0$, is added on the boundary
sites, $i_3 = 0$, which describes the interaction with the
substrate, supposed to occupy the complementary
region $\mathcal{L} \setminus \mathcal{L}'$.

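We constrain the model in the finite box $\Lambda' = \Lambda \cap \mathcal{L}'$,
with $\Lambda$ as above, and impose the value of the spins
outside. The Hamiltonian becomes

$$H_{\Lambda'}(\sigma_{\Lambda'} \mid \bar\sigma) = -J \sum_{\langle i,j \rangle \cap \Lambda' \neq \emptyset} \sigma(i)\sigma(j) - K \sum_{i \in \Lambda',\, i_3 = 0} \sigma(i) \qquad [13]$$

Here $\sigma_{\Lambda'}$ represents the configuration inside $\Lambda'$, the
pairs $\langle i,j \rangle$ are contained in $\mathcal{L}'$, and $\sigma(i) = \bar\sigma(i)$ when
$i \notin \Lambda'$, the configuration $\bar\sigma$ being the given boundary
condition (see Figure 2). The corresponding partition
function is denoted by $Z^{w\bar\sigma}(\Lambda')$.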

Since there are two pure phases in the model, we
must consider two surface free energies, or surface
tensions, $\tau^{w+}$ and $\tau^{w-}$, between the wall and the
positive or negative phase present in the bulk. They
are defined through the choice of the boundary
condition, $\bar\sigma(i) = 1$ or $\bar\sigma(i) = -1$, for all $i \in \mathcal{L}'$. Let us
consider first the case of the $(-)$ boundary condition.

Figure 2 Boundary conditions for the cubic lattice. Above, the
box $\Lambda$ with the $(\pm)$ and (step) boundary conditions. Below, the
box $\Lambda'$ and the wall $W$ with the $(w-)$ boundary conditions.

The surface free energy contribution per unit area
due to the presence of the wall, when we have the
negative phase in the bulk, is

$$\tau^{w-}(\beta, K) = \lim_{L_1, L_2 \to \infty} \lim_{L_3 \to \infty} -\frac{1}{\beta L_1 L_2} \ln \frac{Z^{w-}(\Lambda')}{Z^{-}(\Lambda)^{1/2}} \qquad [14]$$

The division by $Z^{-}(\Lambda)^{1/2}$ allows us to subtract from
the total free energy, $\ln Z^{w-}(\Lambda')$, the bulk term and
all boundary terms which are not related to the
presence of the wall. The existence of limit [14]
follows from correlation inequalities, and we have 
pe 4. 

One can prove, as well, the existence of the Gibbs
state $\mu^{w-}$ of the semi-infinite system, associated to
the $(-)$ boundary condition. This state is the limit of
the finite volume Gibbs measures $\mu_{\Lambda'}(\sigma_{\Lambda'} \mid (-))$
defined by the Hamiltonian [13]. It describes the
local equilibrium properties of the system near
the wall, when deep inside the bulk the system is
in the negative phase. Similar definitions give the
surface tension $\tau^{w+}$ and the Gibbs state $\mu^{w+}$,
corresponding to the boundary condition $\bar\sigma(i) = 1$,
for all $i \in \mathcal{L}'$.

We remark that the states $\mu^{w+}$ and $\mu^{w-}$ are invariant
under translations parallel to the plane $i_3 = 0$, and
introduce the magnetizations, $m^{w-}(z) = \mu^{w-}(\sigma(z))$,
where $z$ denotes the site $(0, 0, z)$, $m^{w-} = m^{w-}(0)$, and
similarly $m^{w+}(z)$ and $m^{w+}$. Their connection with
the surface free energies is given by the formula

$$\tau^{w-}(\beta, K) - \tau^{w+}(\beta, K) = \int_0^K \bigl(m^{w+}(\beta, s) - m^{w-}(\beta, s)\bigr)\, ds \qquad [15]$$

We mention in the following theorem some
results of Fröhlich and Pfister's study. Here $\tau^{\pm}$ is,
as before, the usual surface tension between the
two pure phases of the system, for a horizontal
interface.


Theorem 3 With the above definitions, we have

$$\tau^{w-}(\beta, K) - \tau^{w+}(\beta, K) \le \tau^{\pm}(\beta) \qquad [16]$$
$$m^{w+}(\beta, K) - m^{w-}(\beta, K) \ge 0 \qquad [17]$$

and the difference in [17] is a monotone decreasing
function of the parameter $K$. Moreover, if $m^{w+} = m^{w-}$,
then the Gibbs states $\mu^{w+}$ and $\mu^{w-}$ coincide.


The proof is a subtle application of correlation
inequalities. Since, from Theorem 3, the integrand in
eqn [15] is a positive and decreasing function, the
difference $\Delta\tau = \tau^{w-} - \tau^{w+}$ is a monotone increasing
and concave (and hence continuous) function of the
parameter $K$. On the other hand, one can prove that
$\Delta\tau = \tau^{\pm}$, if $K \ge J$. This justifies the following
definition:

$$K_w(\beta) = \min\{K : \Delta\tau(\beta, K) = \tau^{\pm}(\beta)\} \qquad [18]$$


In the thermodynamic description of wetting, the
partial-wetting regime is characterized by the strict
inequality in [16] or, equivalently, by $K < K_w(\beta)$. We
must have then $m^{w+} \neq m^{w-}$, because of eqn [15].
This shows that, in the case of partial wetting, $\mu^{w+}$
and $\mu^{w-}$ are different Gibbs states.

The complete-wetting regime is characterized by the
equality in [16], that is, by $K \ge K_w(\beta)$. Then, we have
$m^{w+} = m^{w-}$, and taking into account the last statement
in Theorem 3, also $\mu^{w+} = \mu^{w-}$. This last result implies
that there is only one Gibbs state. Thus, complete
wetting corresponds to unicity of the Gibbs state.

In this case, we also have $\lim m^{w-}(z) = m^{*}$, when
$z \to \infty$, because this is always true for $m^{w+}(z)$. This
indicates that we are in the positive phase of the 
system although we have used the (—) boundary 
condition, so that the bulk negative phase cannot 
reach the wall anymore. The film of positive phase, 
which wets the wall completely, has an infinite 
thickness with respect to the unit lattice spacing, in 
the thermodynamic limit. 

When $\beta \to \infty$, only a few particular ground configurations
contribute to the partition functions,
such as the configuration $\sigma(i) = -1$ for the partition
function $Z^{w-}$, etc., and we obtain $\Delta\tau = 2K$ and
$\tau^{\pm} = 2J$. For nonzero but low temperatures, the
small perturbations of these ground states have to be 
considered, a problem that can be treated by the 
method of cluster expansions. In fact, the corre- 
sponding defects can be described by closed con- 
tours as in the case of pure phases. 


Theorem 4 For $K < J$, the functions $\beta\tau^{w-}(\beta, K)$
and $\beta\tau^{w+}(\beta, K)$ are analytic at low temperatures,
that is, provided that $\beta(J - K) \ge c_2$, where $c_2$ is a
given constant. Moreover, $m^{w+}(z)$ and $m^{w-}(z)$ tend,
respectively, to $m^{*}$ and to $-m^{*}$, when $z \to \infty$,
exponentially fast.


The last statement in Theorem 4 tells us that the 
wall affects only a layer of finite thickness (with 
respect to the lattice spacing). From a macroscopic 
point of view, the negative phase reaches the wall, 
and we are in the partial-wetting regime. Indeed, a 
strict inequality holds in [16]. 

Thus, for K <J there is always partial wetting at 
low temperatures. Then the following question arises: 


Question 2 Is there a situation of complete wetting 
at higher temperatures? It is understood here that K 
takes a fixed value, characteristic of the substrate, 
such that $0 < K < J$.


This is known to be the case in dimension $d = 2$,
where the exact value of $K_w(\beta)$ can be obtained
from Abraham's solution of the model:

$$\cosh 2\beta K_w = \cosh 2\beta J - e^{-2\beta J} \sinh 2\beta J$$

Then complete wetting occurs for $\beta$ in the interval
$\beta_c < \beta \le \beta_w(K)$, where $\beta_c$ is the critical inverse
temperature and $\beta_w(K)$ is the solution of $K_w(\beta) = K$.
The case $d = 2$ has been reviewed in Abraham
(1986).
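The two-dimensional wetting formula is easy to evaluate numerically. The sketch below computes $K_w(\beta)$ from Abraham's relation, the critical inverse temperature from $\sinh 2\beta_c J = 1$, and the wetting temperature $\beta_w(K)$ by bisection. The function names are hypothetical, and the bisection assumes that $K_w$ increases with $\beta$ on the interval considered, which is consistent with the numerical values but is not established here.

```python
import math

def beta_c_2d(J=1.0):
    """Critical inverse temperature of the 2D Ising model, sinh(2 beta_c J) = 1."""
    return math.asinh(1.0) / (2.0 * J)

def k_w(beta, J=1.0):
    """K_w(beta) solving cosh(2 beta K_w) = cosh(2 beta J) - exp(-2 beta J) sinh(2 beta J),
    meant for beta >= beta_c."""
    a = 2.0 * beta * J
    rhs = math.cosh(a) - math.exp(-a) * math.sinh(a)
    # Guard against rounding slightly below 1 exactly at beta_c.
    return math.acosh(max(rhs, 1.0)) / (2.0 * beta)

def beta_w(K, J=1.0, tol=1e-12):
    """Solve K_w(beta) = K for 0 < K < J by bisection (assumes K_w increasing)."""
    lo = beta_c_2d(J)
    hi = 2.0 * lo
    for _ in range(60):  # enlarge the bracket until K_w(hi) >= K
        if k_w(hi, J) >= K:
            break
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if k_w(mid, J) < K:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

With $J = 1$ one finds $\beta_c \approx 0.44$, in agreement with Table 1, and $K_w$ grows from 0 at $\beta_c$ toward $J$ as $\beta \to \infty$, so that a substrate with fixed $0 < K < J$ is completely wet only in the window $\beta_c < \beta \le \beta_w(K)$.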

To our knowledge, the above question remains an
open problem for the Ising model in dimension
$d = 3$. The problem has, however, been solved for
the simpler case of a SOS interface model. In this
case, a nice and rather brief proof of the following
result has been given by Chalker (1982): one has
$m^{w+} = m^{w-}$, and hence complete wetting, if

$$2\beta(J - K) < -\ln\bigl(1 - e^{-2\beta J}\bigr)$$

It is very plausible that a similar statement is valid
for the semi-infinite Ising model and also that
Chalker's method could play a role in extending the
proof to this case, provided an additional assumption
is made, namely, that $\beta$ is sufficiently large,
and hence $J - K$ small enough, in order to ensure the
convergence of the cluster expansions and to be able
to use them.
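Chalker's sufficient condition for the SOS model can be rearranged into a threshold on the boundary field: it holds exactly when $K > J + (2\beta)^{-1}\ln(1 - e^{-2\beta J})$. The short sketch below evaluates both forms; the function names are illustrative assumptions, and the code only restates the inequality quoted above for $\beta > 0$ and $J > 0$.

```python
import math

def chalker_complete_wetting(beta, J, K):
    """True when 2 beta (J - K) < -ln(1 - exp(-2 beta J))."""
    return 2.0 * beta * (J - K) < -math.log1p(-math.exp(-2.0 * beta * J))

def chalker_threshold(beta, J):
    """Value of K above which the criterion is satisfied at the given beta."""
    return J + math.log1p(-math.exp(-2.0 * beta * J)) / (2.0 * beta)
```

The threshold lies strictly below $J$ and approaches $J$ as $\beta$ grows, so at large $\beta$ the criterion applies only when $J - K$ is small, in line with the remark above.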


Equilibrium Crystals 


The shape of an equilibrium crystal is obtained, 
according to thermodynamics, by minimizing the 
surface free energy between the crystal and the 
medium, for a fixed volume of the crystal phase. 
Given the orientation-dependent surface tension
$\tau(n)$, the solution to this variational problem,
known under the name of Wulff construction, is
the following set:

$$W = \{x \in \mathbb{R}^3 : x \cdot n \le \tau(n) \ \text{for all } n\} \qquad [19]$$

Notice that the problem is scale invariant, so that if we
solve it for a given volume of the crystal, we get the
solution for other volumes by an appropriate scaling.
We notice also that the symmetry $\tau(n) = \tau(-n)$ is not
required for the validity of formula [19]. In the present
case, $\tau(n)$ is obviously a symmetric function, but
nonsymmetric situations are also physically interesting
and appear, for instance, in the case of a drop on a wall
discussed in the previous section.
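The Wulff construction can be sketched numerically in the plane. A point at distance $r$ from the origin in the direction of angle $\varphi$ belongs to $W$ if and only if $r\cos(\theta - \varphi) \le \tau(\theta)$ for every direction $\theta$, so the boundary of $W$ lies at $r(\varphi) = \min_\theta \tau(\theta)/\cos(\theta - \varphi)$, the minimum being taken over directions with positive cosine. The code below implements this on a finite grid of directions; the function name, the grid size and the two test surface tensions are assumptions made for the illustration, and $\tau$ is assumed strictly positive.

```python
import math

def wulff_radius(tau, phi, n_dirs=3600):
    """Distance from the origin to the boundary of the planar Wulff set
    W = {x : x.n <= tau(n) for all unit n}, in the direction of angle phi.
    `tau` maps the angle theta of n = (cos theta, sin theta) to a positive number."""
    best = math.inf
    for k in range(n_dirs):
        theta = 2.0 * math.pi * k / n_dirs
        c = math.cos(theta - phi)
        if c > 1e-12:  # half-planes with c <= 0 never cut the ray
            best = min(best, tau(theta) / c)
    return best

def isotropic(theta):
    """Orientation-independent surface tension: the Wulff shape is a disk."""
    return 1.0

def square_lattice_zero_temperature(theta):
    """|n1| + |n2|, the zero-temperature form in units of 2J: W is a square."""
    return abs(math.cos(theta)) + abs(math.sin(theta))
```

For the second example the construction returns the square $|x_1| \le 1$, $|x_2| \le 1$: the radius is 1 along the axes and $\sqrt{2}$ along the diagonal, showing how a cusp of $\tau$ produces a flat piece of the boundary.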

The surface tension in the Ising model between
the positive and negative phases has been defined in
eqn [7]. In the two-dimensional case, this function
$\tau(n)$ has (as shown by Abraham) an exact expression
in terms of some Onsager's function. It follows (as
explained in Miracle-Sole (1999)) that the Wulff
shape $W$, in the plane $(x_1, x_2)$, is given by

$$\cosh \beta x_1 + \cosh \beta x_2 \le \cosh^2 2\beta J / \sinh 2\beta J$$

This shape reduces to the empty set for $\beta \le \beta_c$, since
the critical $\beta_c$ satisfies $\sinh 2\beta_c J = 1$. For $\beta > \beta_c$, it is
a strictly convex set with smooth boundary.
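As a check of this formula, the sketch below solves the boundary equation for $x_2$ as a function of $x_1$ and compares the half-width of the shape along a lattice axis with Onsager's surface tension for an interface parallel to that axis, $\tau = 2J + \beta^{-1}\ln\tanh\beta J$. The function names are hypothetical, and the code is intended only for $\beta > \beta_c$, the range in which the text describes the shape.

```python
import math

def wulff_rhs(beta, J=1.0):
    """Right-hand side cosh^2(2 beta J) / sinh(2 beta J) of the shape inequality."""
    a = 2.0 * beta * J
    return math.cosh(a) ** 2 / math.sinh(a)

def wulff_boundary_x2(x1, beta, J=1.0):
    """Nonnegative x2 on the boundary of W above the abscissa x1 (for beta > beta_c),
    or None when x1 lies outside the shape."""
    c = wulff_rhs(beta, J) - math.cosh(beta * x1)
    if c < 1.0:
        return None
    return math.acosh(c) / beta

def onsager_tau_axis(beta, J=1.0):
    """Onsager's surface tension for an interface along a lattice axis."""
    return 2.0 * J + math.log(math.tanh(beta * J)) / beta
```

At $x_1 = 0$ the boundary height coincides with `onsager_tau_axis(beta, J)`, as it must since $\tau$ is the support function of $W$; at $\beta_c$ the right-hand side equals 2 and the shape shrinks to the origin.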

In the three-dimensional case, only certain interface
models can be exactly solved (see the section
“Gibbs states and interfaces”). Consider the Ising
model at zero temperature. The ground configurations
have only one defect, the microscopic interface
$\lambda$, imposed by the boundary condition $(\pm, n)$. Then,
from eqn [9], we may write

$$\tau(n) = \lim_{L_1, L_2 \to \infty} \frac{n_3}{L_1 L_2} \bigl(E_\Lambda(n) - \beta^{-1} \ln N_\Lambda(n)\bigr) \qquad [20]$$

where $E_\Lambda = 2J|\lambda|$ is the energy (all $\lambda$ have the
same minimal area) and $N_\Lambda$ the number of
ground states. Every such $\lambda$ has the property of
being cut only once by all straight lines orthogonal
to the diagonal plane $i_1 + i_2 + i_3 = 0$, provided
that $n_k > 0$, for $k = 1, 2, 3$. Each $\lambda$ can then be
described by an integer function defined on a
triangular plane lattice, the projection of the
cubic lattice $\mathcal{L}$ on the diagonal plane. The model
defined by this set of admissible microscopic
interfaces is precisely the TISOS model. A similar
definition can be given for the BCSOS model that
describes the ground configurations on the body-centered
cubic lattice.

From a macroscopic point of view, the roughness 
or the rigidity of an interface should be apparent 
when considering the shape of the equilibrium 
crystal associated with the system. A typical equili- 
brium crystal at low temperatures has smooth plane 
facets linked by rounded edges and corners. The 
area of a particular facet decreases as the tempera- 
ture is raised and the facet finally disappears at a 
temperature characteristic of its orientation. It can 
be argued that the disappearance of the facet 
corresponds to the roughening transition of the 
interface whose orientation is the same as that of the 
considered facet. 

The exactly solvable interface models mentioned
above, for which the function $\tau(n)$ has been
computed, are interesting examples of this behavior,
and provide valuable information on several
aspects of the roughening transition. This subject
has been reviewed by Abraham (1986), van Beijeren
and Nolden (1987), and Kotecký (1989).

For example, we show in Figure 3 the shape
predicted by the TISOS model (one-eighth of the
shape because of the condition $n_k > 0$). In this
model, the interfaces orthogonal to the three
coordinate axes are rigid at low temperatures.

Figure 3 Cubic equilibrium crystal shown in a projection
parallel to the (1,1,1) direction. The three regions (1, 2, and 3)
indicate the facets and the remaining area represents a curved
part of the crystal surface.

For the three-dimensional Ising model at positive
temperatures, the description of the microscopic
interface, for any orientation $n$, appears as a very
difficult problem. It has been possible, however,
to analyze the interfaces which are very near to
the particular orientations $n_0$, discussed in the
section “Gibbs states and interfaces.” This analysis
allows us to determine the shape of the facets in a
rigorous way.

We first observe that the appearance of a facet
in the equilibrium crystal shape is related, according
to the Wulff construction, to the existence of a
discontinuity in the derivative of the surface
tension with respect to the orientation. More
precisely, assume that the surface tension satisfies
the convexity condition of Theorem 1, and let this
function $\tau(n) = \tau(\theta, \phi)$ be expressed in terms of the
spherical coordinates of $n$, the vector $n_0$ being taken
as the $x_3$-axis. A facet orthogonal to $n_0$ appears in
the Wulff shape if and only if the derivative
$\partial\tau(\theta, \phi)/\partial\theta$ is discontinuous at the point $\theta = 0$,
for all $\phi$. The facet $F \subset \partial W$ consists of the points
$x \in \mathbb{R}^3$ belonging to the plane $x_3 = \tau(n_0)$ and such
that, for all $\phi$ between 0 and $2\pi$,

$$x_1 \cos\phi + x_2 \sin\phi \le \left.\frac{\partial\tau(\theta, \phi)}{\partial\theta}\right|_{\theta = 0^{+}} \qquad [21]$$


The step free energy is expected to play an
important role in the facet formation. It is defined
as the free energy associated with the introduction
of a step of height 1 on the interface, and can be
regarded as an order parameter for the roughening
transition. Let $\Lambda$ be a parallelepiped as in the section
“Pure phases and surface tension,” and introduce
the (step, $m$) boundary conditions (see Figure 2),
associated to the unit vectors $m = (\cos\phi, \sin\phi) \in \mathbb{R}^2$, by

$$\bar\sigma(i) = \begin{cases} 1 & \text{if } i_3 > 0, \text{ or if } i_3 = 0 \text{ and } i_1 m_1 + i_2 m_2 \ge 0 \\ -1 & \text{otherwise} \end{cases} \qquad [22]$$

Then, the step free energy per unit length for a step
orthogonal to $m$ (with $m_2 > 0$) on the horizontal
interface, is

$$\tau^{\mathrm{step}}(\phi) = \lim_{L_1 \to \infty} \lim_{L_2 \to \infty} \lim_{L_3 \to \infty} -\frac{\cos\phi}{\beta L_1} \ln \frac{Z^{\mathrm{step},m}(\Lambda)}{Z^{\pm}(\Lambda)} \qquad [23]$$


A first result concerning this point was obtained
by Bricmont and co-workers, by proving a correlation
inequality which establishes $\tau^{\mathrm{step}}(0)$ as a lower
bound to the one-sided derivative $\partial\tau(\theta, 0)/\partial\theta$ at
$\theta = 0^{+}$ (the inequality extends also to $\phi \neq 0$). Thus,
when $\tau^{\mathrm{step}} > 0$, a facet is expected.

Using the perturbation theory of the horizontal 
interface, it is possible to also study the microscopic 
interfaces associated with the (step, m) boundary 
conditions. When considering these configurations,
the step may be viewed as an additional defect on
the rigid interface described in the section “Gibbs
states and interfaces.” It is, in fact, a long wall
going from one side to the other side of the box $\Lambda$.
The step structure at low temperatures can then be 
analyzed with the help of a new cluster expansion. 
As a consequence of this analysis, we have the 
following theorem. 


Theorem 5 If the temperature is low enough, that
is, if $\beta J \ge c_3$, where $c_3$ is a given constant, then the
step free energy, $\tau^{\mathrm{step}}(\phi)$, exists, is strictly positive,
and extends by positive homogeneity to a strictly
convex function. Moreover, $\beta\tau^{\mathrm{step}}(\phi)$ is an analytic
function of $\zeta = e^{-2\beta J}$, for which an explicit convergent
series expansion can be found.


Using the above results on the step structure, 
similar methods allow us to evaluate the increment 
in surface tension of an interface tilted by a very 
small angle $\theta$ with respect to the rigid horizontal
interface. This increment can be expressed in terms 
of the step free energy, and one obtains the 
following relation. 


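Theorem 6 For $\beta J \ge c_3$, we have

$$\left.\frac{\partial\tau(\theta, \phi)}{\partial\theta}\right|_{\theta = 0^{+}} = \tau^{\mathrm{step}}(\phi) \qquad [24]$$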


This relation, together with eqn [21], implies that 
one obtains the shape of the facet by means of the 
two-dimensional Wulff construction applied to the 
step free energy. The reader will find a detailed 
discussion on these points, as well as the proofs of 
Theorems 5 and 6, in Miracle-Sole (1995). 

From the properties of $\tau^{\mathrm{step}}$ stated in Theorem 5,
it follows that the Wulff equilibrium crystal presents 
well-defined boundary lines, smooth and without 
straight segments, between a rounded part of the 
crystal surface and the facets parallel to the three 
main lattice planes. 

It is expected, but not proved, that at a higher 
temperature, but before reaching the critical 
temperature, the facets associated with the Ising 
model undergo a roughening transition. It is then 
natural to believe that the equality [24] is true for
any $\beta$ larger than $\beta_R$, allowing us to determine the
facet shape from eqns [21] and [24], and that for
$\beta \le \beta_R$, both sides in this equality vanish, and
thus, the disappearance of the facet is involved.
However, the condition that the temperature is 
low enough is needed in the proofs of Theorems 5 
and 6. 


See also: Dimer Problems; Phase Transitions in 
Continuous Systems; Phase Transition Dynamics; 
Two-Dimensional Ising Model; Wulff Droplets. 
Introduction


Stochastic differential equations (SDEs) appear
today as a modeling tool in several sciences, such as
telecommunications, economics, finance, biology,
and quantum field theory.

An SDE is essentially a classical differential 
equation which is perturbed by a random noise. 
When nothing else is specified, SDE means in fact 
ordinary SDE; in that case it corresponds to the 
perturbation of an ordinary differential equation. 
Stochastic partial differential equations (SPDEs) are 
obtained as random perturbation of partial differ- 
ential equations (PDEs). 

One of the most important differences between
deterministic and stochastic ordinary differential
equations is described by the so-called Peano-type
phenomenon. A classical differential equation with
continuous coefficients of linear growth admits
global existence but not necessarily uniqueness, as classical
calculus textbooks illustrate by studying equations of
the type


However, if one perturbs the right-hand side of the
equality with an additive Gaussian white noise $(\xi_t)$
(even with very small intensity), then the problem
becomes well posed. A similar phenomenon happens
with linear PDEs of evolution type perturbed with a
spacetime white noise.

SDEs constitute a vast subject and account for a
very large number of relevant contributions. We try
to orient the reader along the main directions and
to indicate references for the different subfields. We will
prefer to refer to monographs when available,
instead of articles.


Motivation and Preliminaries 


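In the whole article $T$ will be a strictly positive real
number. Let us consider continuous functions
$b:\mathbb{R}_+\times\mathbb{R}^d\to\mathbb{R}^d$, $a:\mathbb{R}_+\times\mathbb{R}^d\to\mathbb{R}^{d\times m}$ and $x_0\in\mathbb{R}^d$.
We consider a differential problem of the following
type:

$$\frac{dX_t}{dt} = b(t, X_t), \qquad X_0 = x_0 \qquad [1]$$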


Let $(\Omega,\mathcal{F},P)$ be a complete probability space.
Suppose that the previous equation is perturbed by a
random noise $(\xi_t)_{t\ge 0}$. For modeling reasons it
could be reasonable to suppose that $(\xi_t)_{t\ge 0}$ satisfies the
following properties.

1. It is a family of independent random variables
(r.v.'s).

2. $(\xi_t)$ is "stationary", that is, for any positive
integer $n$ and positive reals $h, t_0, t_1, \ldots, t_n$, the law of
$(\xi_{t_0+h},\ldots,\xi_{t_n+h})$ does not depend on $h$.
More precisely we perturb eqn [1] as follows:

$$\frac{dX_t}{dt} = b(t, X_t) + a(t, X_t)\,\xi_t, \qquad X_0 = x_0 \qquad [2]$$


We suppose for a moment that $d = m = 1$. In reality no
reasonable real-valued process $(\xi_t)_{t\ge 0}$ fulfilling the
previous assumptions exists. In particular, if the process $(\xi_t)$
exists (resp. $(\xi_t)$ exists and each $\xi_t$ is a square-integrable
r.v.), then the process cannot have continuous
paths (resp. it cannot be measurable with respect
to $\Omega\times\mathbb{R}_+$). However, suppose that such a process
exists; we set $B_t = \int_0^t \xi_s\,ds$. In that case, properties (1)
and (2) can be translated into the following on $(B_t)$.

(P1) It has independent increments, which means
that for any $t_0,\ldots,t_n,h \ge 0$, $B_{t_1+h}-B_{t_0+h},\ldots,
B_{t_n+h}-B_{t_{n-1}+h}$ are independent r.v.'s.

(P2) It has stationary increments, which means that
for any $t_0,\ldots,t_n,h \ge 0$, the law of $(B_{t_1+h}-B_{t_0+h},\ldots,
B_{t_n+h}-B_{t_{n-1}+h})$ does not depend on $h$.


On the other hand, it is natural to require that 


(C1) $B_0 = 0$ a.s.,
(C2) it is a continuous process, that is, it has
continuous paths a.s.


Equation [2] should be rewritten in some integral form

$$X_t = X_0 + \int_0^t b(s, X_s)\,ds + \int_0^t a(s, X_s)\,dB_s \qquad [3]$$

Clearly the paths of the process $(B_t)$ cannot be differentiable,
so one has to give meaning to the integral
$\int_0^t a(s, X_s)\,dB_s$. This will be intended in the "Itô"
sense, see considerations below.

An important result of probability theory says
that a stochastic process $(B_t)$ fulfilling properties P1,
P2 and C1, C2 is essentially a "Brownian motion".
More precisely, there are real constants $b,\sigma$ such
that $B_t = bt + \sigma W_t$, where $(W_t)$ is a classical Brownian
motion defined below.


Definition 1 


(i) A (continuous) stochastic process $(W_t)$ is called
classical "Brownian motion" if $W_0 = 0$ a.s.,
it has independent increments and the law of
$W_t - W_s$ is that of a Gaussian $N(0, t-s)$ r.v.

(ii) An m-dimensional Brownian motion is a vector
$(W^1,\ldots,W^m)$ of independent classical Brownian
motions.
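
As an illustration of Definition 1, the following is a minimal sketch (not part of the original article) that samples a classical Brownian motion on a uniform grid from independent Gaussian increments and checks empirically that the variance of $W_t - W_s$ is close to $t - s$. The function names, the grid size, and the number of paths are assumptions chosen only for the example.

```python
import math
import random


def brownian_path(T, n_steps, rng):
    """Sample W on a uniform grid of [0, T]: W_0 = 0, independent N(0, dt) increments."""
    dt = T / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


def increment_variance(s_idx, t_idx, T, n_steps, n_paths, seed=0):
    """Empirical variance of W_t - W_s (grid indices s_idx < t_idx) over n_paths paths."""
    rng = random.Random(seed)
    incs = []
    for _ in range(n_paths):
        w = brownian_path(T, n_steps, rng)
        incs.append(w[t_idx] - w[s_idx])
    mean = sum(incs) / n_paths
    return sum((x - mean) ** 2 for x in incs) / (n_paths - 1)


if __name__ == "__main__":
    # With T = 1 and 50 steps, indices 10 and 40 correspond to s = 0.2 and t = 0.8,
    # so the printed value should be close to t - s = 0.6 (up to sampling error).
    print(increment_variance(10, 40, 1.0, 50, 4000))
```

The simulated increments are exact in law on the grid; only the variance estimate carries Monte Carlo error.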


Let $(\mathcal{F}_t)_{t\ge 0}$ be a filtration fulfilling the usual
conditions, see Karatzas and Shreve (1991, section 1.1).


There one can find basic concepts of the theory of
stochastic processes, such as the concepts of adapted and
progressively measurable process. An adapted process
is also said to be nonanticipating with respect to the
filtration $(\mathcal{F}_t)$, which represents the state of the
information at each time $t$. A process $(X_t)$ is said to
be adapted if for any $t$, $X_t$ is $\mathcal{F}_t$-measurable. The
notion of progressively measurable process is a slight
refinement of the notion of adapted process.


Definition 2 


(i) A (continuous) $(\mathcal{F}_t)$-adapted process $(W_t)$ is called
(classical) $(\mathcal{F}_t)$-Brownian motion if $W_0 = 0$ and if,
for any $s < t$, $W_t - W_s$ is an $N(0, t-s)$ distributed
r.v. which is independent of $\mathcal{F}_s$.

(ii) An $(\mathcal{F}_t)$-m-dimensional Brownian motion is a
vector $(W^1,\ldots,W^m)$ of independent $(\mathcal{F}_t)$-classical
Brownian motions.


From now on, we will consider a probability
space $(\Omega,\mathcal{F},P)$ equipped with a filtration $(\mathcal{F}_t)_{t\ge 0}$
fulfilling the usual conditions. From now on all the
considered filtrations will have that property.

Let $W=(W_t)_{t\ge 0}$ be an $(\mathcal{F}_t)_{t\ge 0}$-m-dimensional classical
Brownian motion. Karatzas and Shreve (1991,
chapter 3) and Revuz and Yor (1999, chapter 4)
introduce the notion of the stochastic Itô integral
announced before. Let $Y=(Y^1,\ldots,Y^m)$ be a progressively
measurable m-dimensional process such that
$\int_0^T \|Y_s\|^2\,ds < \infty$; then the Itô integral $\int_0^T Y_s\,dW_s$ is well
defined. In particular, the indefinite integral $\int_0^{\cdot} Y_s\,dW_s$ is
an $(\mathcal{F}_t)$-progressively measurable continuous process.
If $Y$ is an $\mathbb{R}^{d\times m}$ matrix-valued process, the integral
$\int_0^t Y_s\,dW_s$ is defined componentwise and is a
vector in $\mathbb{R}^d$. The analog of differential calculus in
the framework of stochastic processes is Itô calculus,
see again Karatzas and Shreve (1991, chapter 3) and
Revuz and Yor (1999, chapter 4). An important tool is
the concept of quadratic variation $[X]$ of a stochastic
process, when it exists. For instance, the quadratic
variation $[W]_t$ of a classical Brownian motion equals $t$.
If $M_t = \int_0^t Y_s\,dW_s$, then $[M]_t = \int_0^t \|Y_s\|^2\,ds$. One celebrated
theorem of P Lévy states the following: if $(M_t)$
is a continuous $(\mathcal{F}_t)$-local martingale such that
$[M]_t = t$, then $M$ is an $(\mathcal{F}_t)$-classical Brownian motion.
That theorem is called the "Lévy characterization
theorem of Brownian motion." The Itô formula constitutes
the natural generalization of the fundamental theorem of
differential calculus to stochastic calculus. Another
significant tool is the Girsanov theorem; it states essentially
the following: suppose that the following so-called
"Novikov condition" is verified:


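$$E\left(\exp\left(\frac{1}{2}\int_0^T \|Y_t\|^2\,dt\right)\right) < \infty$$

Then the process $\tilde{W}_t = W_t + \int_0^t Y_s\,ds$, $t \in [0, T]$, is
again an m-dimensional $(\mathcal{F}_t)$-classical Brownian
motion under a new probability measure $Q$ on
$(\Omega, \mathcal{F}_T)$ defined by

$$dQ = dP\,\exp\left(-\int_0^T Y_s\,dW_s - \frac{1}{2}\int_0^T \|Y_s\|^2\,ds\right)$$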


Let $\xi$ be an $\mathcal{F}_0$-measurable r.v., for instance, $\xi =
x \in \mathbb{R}^d$. We are interested in the SDE

$$dX_t = a(t, X_t)\,dW_t + b(t, X_t)\,dt, \qquad X_0 = \xi \qquad [4]$$


Definition 3 A progressively measurable process
$(X_t)_{t\in[0,T]}$ is said to be a solution of [4] if a.s.

$$X_t = \xi + \int_0^t a(s, X_s)\,dW_s + \int_0^t b(s, X_s)\,ds, \qquad \forall t \in [0, T] \qquad [5]$$


provided that the right-hand side makes
sense. In particular, such a solution is continuous.
The function $a$ (resp. $b$) is called the diffusion (resp. drift)
coefficient of the SDE. The coefficients $a$ and $b$ may sometimes be
allowed to be random; however, this dependence
has to be progressively measurable. Clearly, we can
define the notion of solution $(X_t)_{t\ge 0}$ on the whole
positive real axis.


We remark that those equations are called Itô
SDEs. A solution of the previous equation is called a
diffusion process.
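
To make the integral form [5] concrete, here is a minimal sketch of the Euler-Maruyama time discretization of [4] in the scalar case $d = m = 1$. This scheme is a standard numerical device and is not discussed in the article itself; the function name `euler_maruyama` and the Ornstein-Uhlenbeck test coefficients are assumptions made for illustration.

```python
import math
import random


def euler_maruyama(a, b, x0, T, n_steps, rng):
    """Approximate one path of dX_t = a(t, X_t) dW_t + b(t, X_t) dt on [0, T] (d = m = 1).

    a, b : callables (t, x) -> float, the diffusion and drift coefficients
    x0   : deterministic initial condition
    rng  : random.Random instance supplying the Gaussian increments
    """
    dt = T / n_steps
    t, x = 0.0, x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment over [t, t + dt]
        x = x + a(t, x) * dw + b(t, x) * dt  # coefficients frozen at the left endpoint
        t += dt
        path.append(x)
    return path


if __name__ == "__main__":
    # Hypothetical example: a = 0.5 (constant), b(t, x) = -x, X_0 = 1.
    rng = random.Random(0)
    finals = [euler_maruyama(lambda t, x: 0.5, lambda t, x: -x, 1.0, 1.0, 100, rng)[-1]
              for _ in range(2000)]
    # For this linear equation E[X_1] = exp(-1); the sample mean should be close to it.
    print(sum(finals) / len(finals), math.exp(-1.0))
```

Evaluating the coefficients at the left endpoint of each step mirrors the nonanticipating character of the Itô integral.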


The Lipschitz Case 


The most natural framework for studying the 
existence and uniqueness for SDEs appears when 
the coefficients are Lipschitz. 

A function $\gamma:[0,T]\times\mathbb{R}^m\to\mathbb{R}^d$ is said to have
"polynomial growth" (with respect to $x$ uniformly in
$t$), if for some $n$ there is a constant $C > 0$ with

$$\sup_{t\in[0,T]}\|\gamma(t,x)\| \le C(1 + \|x\|^n) \qquad [6]$$

The same function is said to have "linear growth" if
[6] holds with $n=1$. A function $\gamma:\mathbb{R}_+\times\mathbb{R}^m\to\mathbb{R}^d$ is
said to be "locally Lipschitz" (with respect to
$x$ uniformly in $t$), if for every $T > 0$ and $K > 0$, the restriction
$\gamma|_{[0,T]\times[-K,K]^m}$ is Lipschitz (with respect to $x$ uniformly
with respect to $t$).

Let $a:\mathbb{R}_+\times\mathbb{R}^d\to\mathbb{R}^{d\times m}$, $b:\mathbb{R}_+\times\mathbb{R}^d\to\mathbb{R}^d$ be
Borel functions, $\xi$ an $\mathbb{R}^d$-valued $\mathcal{F}_0$-measurable r.v.,
and $(W_t)_{t\ge 0}$ an m-dimensional $(\mathcal{F}_t)$-Brownian
motion.

Classical fixed-point theorems allow us to establish
the following classical result.

Theorem 1 We suppose $a$ and $b$ locally Lipschitz
with linear growth. Let $\xi$ be a square-integrable r.v.
that is $\mathcal{F}_0$-measurable. Then [4] has a unique
solution $X$. Moreover,

$$E\left(\sup_{t\le T}|X_t|^2\right) < \infty$$


Remark 1


(i) Equation [4] can be settled similarly by putting the
initial condition $x$ at some time $s$. In that case
the problem is again well posed. If $\xi = x$ is a
deterministic point of $\mathbb{R}^d$, then we will often
denote by $X^{s,x}$ the solution of that problem.

(ii) If the coefficients are only locally Lipschitz, the
equation may be solved until a stopping time. If
$d = 1$, it is possible to state necessary and sufficient
conditions for nonexplosion (Feller test).

(iii) The theorem above admits several generalizations.
For instance, the Brownian motion can be
replaced by general semimartingales (possibly
with jumps, such as Lévy processes).


An important role of diffusion processes is that
they provide probabilistic representations of
PDEs of parabolic (and even elliptic) type. We will
only mention here the parabolic framework.

We denote $A(t,x) = a(t,x)\,a(t,x)^*$, where $*$ means
transposition for matrices. $(t,x) \mapsto A(t,x) = (A_{ij}(t,x))$
is a $d \times d$ matrix-valued function. Let us consider also
continuous functions $k:[0,T]\times\mathbb{R}^d\to\mathbb{R}$, $g:[0,T]\times
\mathbb{R}^d\to\mathbb{R}$ with polynomial growth or non-negative.

Given a solution of [4], we can associate its
generator $(L_t, t \in [0, T])$ setting

$$L_t f(x) = \frac{1}{2}\sum_{i,j=1}^{d} A_{ij}(t,x)\,\partial^2_{ij} f(x) + b(t,x)\cdot\nabla f(x)$$
ij-l 


The Feynman-Kac theorem is stated below; it
provides a probabilistic representation of an associated
linear parabolic PDE.


Theorem 2 Suppose there is a function $v:[0,T]\times
\mathbb{R}^d\to\mathbb{R}$, continuous with polynomial growth and of
class $C^{1,2}([0,T)\times\mathbb{R}^d)$, satisfying the following
Cauchy problem:

$$(\partial_t + L_t)v - kv = g, \qquad v(T,x) = f(x) \qquad [7]$$

Then

$$v(s,x) = E\left(f(X_T)\exp\left(-\int_s^T k(\theta, X_\theta)\,d\theta\right) - \int_s^T g(t, X_t)\exp\left(-\int_s^t k(\theta, X_\theta)\,d\theta\right)dt\right)$$

for $(s,x) \in [0,T]\times\mathbb{R}^d$, where $X = X^{s,x}$. In particular,
such a solution is unique.
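
The representation in Theorem 2 can be checked numerically in a simple special case. The sketch below is an illustration under stated assumptions, not the article's own construction: it takes $d = 1$, $a = 1$, $b = 0$ and $g = 0$, so that $X^{s,x}_t = x + W_t - W_s$, and estimates $E\big(f(X_T)\exp(-\int_s^T k(\theta,X_\theta)\,d\theta)\big)$ by Monte Carlo. For $f(x) = x^2$ and a constant $k = r$, one can verify by direct differentiation that $v(s,x) = e^{-r(T-s)}(x^2 + T - s)$ solves the Cauchy problem, which gives a reference value. The function name and parameter choices are hypothetical.

```python
import math
import random


def feynman_kac_mc(f, k, s, x, T, n_paths, n_steps, seed=0):
    """Monte Carlo estimate of E[f(X_T) exp(-int_s^T k(theta, X_theta) dtheta)]
    for X_t = x + (W_t - W_s), i.e. a = 1, b = 0 and g = 0 in the Feynman-Kac formula."""
    rng = random.Random(seed)
    dt = (T - s) / n_steps
    total = 0.0
    for _ in range(n_paths):
        t, xt, integral_k = s, x, 0.0
        for _ in range(n_steps):
            integral_k += k(t, xt) * dt           # left-point rule for the k-integral
            xt += rng.gauss(0.0, math.sqrt(dt))   # exact Brownian increment
            t += dt
        total += f(xt) * math.exp(-integral_k)
    return total / n_paths


if __name__ == "__main__":
    r = 0.5
    estimate = feynman_kac_mc(lambda y: y * y, lambda t, y: r, 0.0, 1.0, 1.0, 5000, 20)
    exact = math.exp(-r * 1.0) * (1.0 ** 2 + 1.0)   # e^{-r(T-s)} (x^2 + T - s)
    print(estimate, exact)
```

The two printed numbers should agree up to Monte Carlo error of order $1/\sqrt{\text{n\_paths}}$.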


Remark 2 


(i) In order to obtain "classical solutions" of the
above Cauchy problem, one needs some conditions.
It is the case, for instance, when the
following ellipticity condition holds on $A$:

$$\exists c > 0,\ \forall (t,x) \in [0,T]\times\mathbb{R}^d,\ \forall (\xi_1,\ldots,\xi_d) \in \mathbb{R}^d:\qquad \sum_{i,j=1}^{d} A_{ij}(t,x)\,\xi_i\xi_j \ge c\sum_{i=1}^{d}|\xi_i|^2 \qquad [8]$$


In the degenerate case, it is possible to deal with
viscosity solutions, in the sense of P L Lions.
This theorem establishes an important link
between deterministic PDEs and SDEs.
(ii) A natural generalization of the Feynman-Kac theorem
comes from the system of forward-backward
SDEs in the sense of Pardoux and Peng.
(iii) Other types of probabilistic representation do
appear in stochastic control theory through the
so-called verification theorems, see for instance,
Fleming and Soner (1993) and Yong and Zhou
(1999). In that case, the (nonlinear) Hamilton-Jacobi-Bellman
deterministic equation is
represented by a controlled SDE.
(iv) Another bridge between nonlinear PDEs and
diffusions can be provided in the framework of
interacting particle systems with chaos propagation,
see Graham et al. (1996) for a survey on
those problems. Among the most significant
nonlinear PDEs investigated probabilistically, we
quote the case of porous media equations. For
instance, for a positive integer $m$, a solution to

$$\partial_t u = \frac{1}{2}\,\partial^2_{xx}\left(u^{2m+1}\right) \qquad [9]$$

can be represented by a (nonlinear) diffusion of
the type, see Benachour et al. (1996),

$$dX_t = u^m(t, X_t)\,dW_t, \qquad u(t,\cdot) = \text{law density of } X_t \qquad [10]$$


Different Notions of Solutions 


Let $a$ and $b$ be as at the beginning of the previous
section. Let $(\Omega,\mathcal{F},P)$ be a probability space, $(\mathcal{F}_t)_{t\ge 0}$ a
filtration fulfilling the usual conditions, and $(W_t)_{t\ge 0}$ an
$(\mathcal{F}_t)_{t\ge 0}$-classical Brownian motion. Let $\xi$ be
an $\mathcal{F}_0$-measurable r.v. In the section "Motivation
and preliminaries," we defined the notion of solution
of the following equation:

$$dX_t = b(t, X_t)\,dt + a(t, X_t)\,dW_t, \qquad X_0 = \xi \qquad [11]$$


This equation will be denoted by E(a, b) (without initial 
condition). However, as we will see, the general 
concept of solution of an SDE is more sophisticated 
and subtle than in the deterministic case. We distin- 
guish several variants of existence and uniqueness. 


Definition 4 (Strong existence). We will say that
equation $E(a,b)$ admits strong existence if the
following holds. Given any probability space
$(\Omega,\mathcal{F},P)$, a filtration $(\mathcal{F}_t)_{t\ge 0}$, an $(\mathcal{F}_t)_{t\ge 0}$-Brownian
motion $(W_t)_{t\ge 0}$, and an $\mathcal{F}_0$-measurable and square-integrable
r.v. $\xi$, there is a process $(X_t)_{t\ge 0}$ solution
to $E(a,b)$ with $X_0 = \xi$ a.s.


Definition 5 (Pathwise uniqueness). We will say
that equation $E(a,b)$ admits pathwise uniqueness if
the following property is fulfilled. Let $(\Omega,\mathcal{F},P)$ be a
probability space, $(\mathcal{F}_t)_{t\ge 0}$ a filtration, and $(W_t)_{t\ge 0}$ an $(\mathcal{F}_t)_{t\ge 0}$-Brownian
motion. If two processes $X, \tilde{X}$ are
two solutions such that $X_0 = \tilde{X}_0$ a.s., then $X$ and $\tilde{X}$
coincide.


Definition 6 (Existence in law or weak existence).
Let $\nu$ be a probability law on $\mathbb{R}^d$. We will say that
$E(a,b;\nu)$ admits weak existence if there is a
probability space $(\Omega,\mathcal{F},P)$, a filtration $(\mathcal{F}_t)_{t\ge 0}$, an
$(\mathcal{F}_t)_{t\ge 0}$-Brownian motion $(W_t)_{t\ge 0}$, and a process
$(X_t)_{t\ge 0}$ solution of $E(a,b)$ with $\nu$ being the law of $X_0$.

We say that $E(a,b)$ admits weak existence if
$E(a,b;\nu)$ admits weak existence for every $\nu$.


Definition 7 (Uniqueness in law). Let $\nu$ be a
probability law on $\mathbb{R}^d$. We say that $E(a,b;\nu)$ has a
unique solution in law if the following holds. We
consider an arbitrary probability space $(\Omega,\mathcal{F},P)$ and
a filtration $(\mathcal{F}_t)_{t\ge 0}$ on it; we consider also another
probability space $(\tilde{\Omega},\tilde{\mathcal{F}},\tilde{P})$ equipped with another
filtration $(\tilde{\mathcal{F}}_t)_{t\ge 0}$; we consider an $(\mathcal{F}_t)_{t\ge 0}$-Brownian
motion $(W_t)_{t\ge 0}$, and an $(\tilde{\mathcal{F}}_t)_{t\ge 0}$-Brownian motion
$(\tilde{W}_t)_{t\ge 0}$; we suppose having a process $(X_t)_{t\ge 0}$ (resp. a
process $(\tilde{X}_t)_{t\ge 0}$) solution of $E(a,b)$ on the first (resp.
on the second) probability space such that both the
laws of $X_0$ and $\tilde{X}_0$ are identical to $\nu$. Then $X$ and $\tilde{X}$
must have the same law as r.v.'s with values in
$E = C(\mathbb{R}_+)$ (or $C[0,T]$).

We say that $E(a,b)$ has a unique solution in law if
$E(a,b;\nu)$ has a unique solution in law for every $\nu$.


There are important theorems which establish 
bridges among the preceding notions. One of the 
most celebrated is the following. 


Proposition 1 (Yamada-Watanabe). Consider the
equation $E(a,b)$.


(i) Pathwise uniqueness implies uniqueness in law. 
(ii) Weak existence and pathwise uniqueness imply 
strong existence. 


A version can be stated for E(a,b;v) where v is a 
fixed probability law. 


Remark 3 


(i) If $a$ and $b$ are locally Lipschitz with linear
growth, Theorem 1 implies that $E(a,b)$ admits
strong existence and pathwise uniqueness.

(ii) If a and b are only locally Lipschitz, then 
pathwise uniqueness is fulfilled. 


Existence and Uniqueness in Law 


A way to create weak solutions of $E(1,b)$ when
$(t,x) \mapsto b(t,x)$ is Borel with linear growth is the
Girsanov theorem. Suppose $d = 1$ for simplicity. Let
us consider an $(\mathcal{F}_t)$-classical Brownian motion $(X_t)$.
We set

$$W_t = X_t - \int_0^t b(s, X_s)\,ds$$

Under some suitable probability $Q$, $(W_t)$ is an $(\mathcal{F}_t)$-classical
Brownian motion. Therefore, $(X_t)$ provides
a solution to $E(1,b;\delta_0)$.
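
The change of measure can be illustrated numerically in the simplest case of a constant drift $b \equiv c$, where the Girsanov density on $\mathcal{F}_T$ reduces to $dQ/dP = \exp(cX_T - c^2T/2)$. The sketch below, which is an illustration under that assumption and not part of the article, samples $X_T$ as a Brownian motion under $P$ and reweights by this density; since $W_t = X_t - ct$ is a Brownian motion under $Q$, the weighted mean of $X_T$ should be close to $cT$. The function name and sample size are hypothetical.

```python
import math
import random


def girsanov_mean(c, T, n_paths, seed=0):
    """Estimate E_Q[X_T] by reweighting P-samples of a Brownian motion X
    with the density dQ/dP = exp(c X_T - c^2 T / 2) (constant drift b = c)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        x_T = rng.gauss(0.0, math.sqrt(T))             # X is a Brownian motion under P
        density = math.exp(c * x_T - 0.5 * c * c * T)  # Girsanov weight for b = c
        total += x_T * density
    return total / n_paths


if __name__ == "__main__":
    # Under Q, X_t = W_t + c t, so E_Q[X_T] = c T; here c T = 0.5.
    print(girsanov_mean(0.5, 1.0, 20000))
```

For a non-constant drift the density involves the stochastic integral $\int_0^T b(s,X_s)\,dX_s$, which would have to be discretized along each path.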

We continue with an example where E(a, b) does 
not admit pathwise uniqueness, even though it 
admits uniqueness in law. 


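Example 1 We consider the stochastic equation

$$X_t = \int_0^t \operatorname{sign}(X_s)\,dW_s \qquad [12]$$

with

$$\operatorname{sign}(x) = \begin{cases} 1 & \text{if } x \ge 0 \\ -1 & \text{if } x < 0 \end{cases}$$

It corresponds to $E(a,b;\delta_0)$ with $b = 0$ and
$a(x) = \operatorname{sign}(x)$.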


If $(W_t)_{t\ge 0}$ is an $(\mathcal{F}_t)$-classical Brownian motion,
then $(X_t)_{t\ge 0}$ is an $(\mathcal{F}_t)_{t\ge 0}$-continuous local martingale
vanishing at zero such that $[X]_t = t$. According to the
Lévy characterization theorem stated earlier, $X$ is an
$(\mathcal{F}_t)_{t\ge 0}$-classical Brownian motion. This shows in
particular that $E(a,b;\delta_0)$ admits uniqueness in law.
In the sequel, we will show that $E(a,b;\delta_0)$ also
admits weak existence.

Let now $(\Omega,\mathcal{F},P)$ be a probability space, $(W_t)_{t\ge 0}$ an
$(\mathcal{F}_t)_{t\ge 0}$-classical Brownian motion with respect to a
filtration and $(X_t)_{t\ge 0}$ such that [12] is verified. Then
$\tilde{X}_t = -X_t$ can also be shown to be a solution.
Therefore, $E(a,b;\delta_0)$ does not admit pathwise
uniqueness.

We continue by stating a result that holds in the
multidimensional case.

Proposition 3 (Stroock-Varadhan). Let $\nu$ be a
probability on $\mathbb{R}^d$ such that

$$\int_{\mathbb{R}^d}\|x\|^{2m}\,\nu(dx) < +\infty \qquad [13]$$

for a certain $m > 1$. We suppose that $a, b$ are
continuous with linear growth. Then $E(a,b;\nu)$
admits weak existence.


From now on, a function $\gamma:[0,T]\times\mathbb{R}^m\to\mathbb{R}^d$ will
be said to be Hölder-continuous if it is Hölder-continuous
in the space variable $x \in \mathbb{R}^m$ uniformly with respect
to the time variable $t \in [0,T]$.

Stroock and Varadhan (1979) also provide the
following result, which is an easy consequence of
their theorem 7.2.1.


Proposition 4 We suppose $a, b$ both Hölder-continuous
and bounded, and such that condition [8] is
fulfilled. Then the SDE $E(a,b;\nu)$ admits weak uniqueness.


Remark 4 


(i) The Hólder condition and [8] in Proposition 4 
may be relaxed and replaced with the solva- 
bility of a Cauchy problem of a parabolic PDE 
with suitable terminal value. 

(ii) In the case $d = 1$, if $a, b$ are bounded and just
Borel with [8] for $x$ on each compact, then
$E(a,b;\nu)$ admits weak existence and uniqueness
in law. See Stroock and Varadhan (1979,
exercises 7.3.2 and 7.3.3).

(iii) If $d = 2$, the same holds as in the previous point
provided that, moreover, $a$ does not depend on
time.


We proceed with some more specifically one-dimensional
material, stating some results of
H J Engelbert and W Schmidt, who furnished
necessary and sufficient conditions for weak existence
and uniqueness in law of SDEs.

For a Borel function $\sigma:\mathbb{R}\to\mathbb{R}$, we first define

$$Z(\sigma) = \{x \in \mathbb{R} \mid \sigma(x) = 0\}$$

then we define the set $I(\sigma)$ as the set of real numbers
$x$ such that

$$\int_{x-\varepsilon}^{x+\varepsilon}\frac{dy}{\sigma^2(y)} = \infty, \qquad \forall \varepsilon > 0$$


Proposition 5 (Engelbert-Schmidt criterion). Suppose
that $a:\mathbb{R}\to\mathbb{R}$, that is, $a$ does not depend on time,
and consider the equation without drift $E(a,0)$.


(i) $E(a,0)$ admits weak existence (without explosion)
if and only if

$$I(a) \subset Z(a) \qquad [14]$$

(ii) $E(a,0)$ admits weak existence and uniqueness in
law if and only if

$$I(a) = Z(a) \qquad [15]$$

Remark 5


(i) If $a$ is continuous, then [14] is always verified.
Indeed, if $a(x) \ne 0$, there is $\varepsilon > 0$ such that

$$|a(y)| > 0, \qquad \forall y \in [x-\varepsilon, x+\varepsilon]$$

Therefore, $x$ cannot belong to $I(a)$.

(ii) Condition [14] is verified also for some discontinuous
functions as, for instance, $a(x) = \operatorname{sign}(x)$.
This confirms what was affirmed
previously, that is, the weak existence (and
uniqueness in law) for $E(a,0)$.

(iii) If $a(x) = 1_{\{0\}}(x)$, [14] is not verified.

(iv) If $a(x) = |x|^\alpha$, $\alpha \ge 1/2$, then

$$Z(a) = I(a) = \{0\}$$

So there is at most one solution in law for
$E(a,0)$.

(v) The proof is technical and makes use of the
Lévy characterization theorem of Brownian
motion.


Results on Pathwise Uniqueness 


Proposition 6 (Yamada-Watanabe). Let $a,b:\mathbb{R}_+\times
\mathbb{R}\to\mathbb{R}$ and consider again $E(a,b)$. Suppose $b$
globally Lipschitz and $h:\mathbb{R}_+\to\mathbb{R}_+$ strictly increasing
and continuous such that


(i) $h(0) = 0$;

(ii) $\int_0^\varepsilon (1/h^2)(y)\,dy = \infty$, $\forall \varepsilon > 0$; and
(iii) $|a(t,x) - a(t,y)| \le h(|x-y|)$.

Then pathwise uniqueness is verified.

Remark 6

(i) In Proposition 6, one typical choice is
$h(u) = u^\alpha$, $\alpha \ge 1/2$.
(ii) Pathwise uniqueness for $E(a,b)$ holds therefore
if $b$ is globally Lipschitz and $a$ is Hölder-continuous
with parameter equal to 1/2.


Corollary 1 Suppose that the assumptions of
Proposition 6 are verified and $a, b$ are continuous with
linear growth. Then $E(a,b;\nu)$ admits strong existence
and pathwise uniqueness, whenever $\nu$ verifies
condition [13].


Proof It follows from Propositions 6 and 3
together with Proposition 1 (ii). □


Remark 7 Suppose $d = 1$. Pathwise uniqueness for
$E(a,b)$ also holds under the following assumptions.


(i) $a, b$ are bounded, $a$ is time independent and
$a \ge \text{const.} > 0$, $h$ as in Proposition 6. This result
has an analogous form in the case of spacetime
white noise driven SPDEs of parabolic type, as
proved by Bally, Gyöngy, and Pardoux in 1994.

(ii) $a$ independent of time, $b$ bounded and $a \ge
\text{const.} > 0$; moreover, $|a(x) - a(y)|^2 \le |f(y) -
f(x)|$ and $f$ is increasing and bounded.


For illustration we provide some significant 
examples. 


Example 2

$$X_t = \int_0^t |X_s|^\alpha\,dW_s, \qquad t \ge 0 \qquad [16]$$

We set $a(x) = |x|^\alpha$, $0 < \alpha < 1$. This is equation
$E(a,0)$ with $a(x) = |x|^\alpha$. According to the Engelbert-Schmidt
notations, we have $Z(a) = \{0\}$. Moreover


(i) If $\alpha \ge 1/2$, then $I(a) = \{0\}$.
(ii) If $\alpha < 1/2$, then $I(a) = \emptyset$.


Therefore, according to Proposition 5, $E(a,0)$ admits
weak existence. On the other hand, if $\alpha \ge 1/2$,

$$|a(x) - a(y)| \le h(|x-y|) \qquad [17]$$

where $h(z) = z^\alpha$. According to Proposition 6, [16]
admits pathwise uniqueness and by Corollary 1,
also strong existence. The unique solution is $X \equiv 0$.
If $\alpha < 1/2$, $X \equiv 0$ is always a solution. This is not
the only one; even uniqueness in law is not true.
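
The dichotomy between $\alpha \ge 1/2$ and $\alpha < 1/2$ in the computation of $I(a)$ at $x = 0$ can be made explicit with a short calculation. The sketch below, added for illustration, evaluates in closed form the integral of $1/a^2(y) = |y|^{-2\alpha}$ over $\delta \le |y| \le \varepsilon$; letting $\delta \to 0$ shows divergence exactly when $\alpha \ge 1/2$. The function name `truncated_integral` is an assumption for the example.

```python
import math


def truncated_integral(alpha, eps, delta):
    """Integral of |y|^(-2 alpha) over delta <= |y| <= eps, i.e. 2 * int_delta^eps y^(-2 alpha) dy.

    As delta -> 0 this tends to the integral defining membership of 0 in I(a)
    for a(y) = |y|^alpha.
    """
    p = 1.0 - 2.0 * alpha
    if abs(p) < 1e-12:                       # alpha = 1/2: logarithmic divergence
        return 2.0 * math.log(eps / delta)
    return 2.0 * (eps ** p - delta ** p) / p


if __name__ == "__main__":
    for alpha in (0.25, 0.5, 0.75):
        values = [truncated_integral(alpha, 1.0, d) for d in (1e-4, 1e-8, 1e-12)]
        # alpha = 0.25 stays bounded (limit 4); alpha = 0.5 and 0.75 grow without bound.
        print(alpha, values)
```

For $\alpha < 1/2$ the limit is finite, so $0 \notin I(a)$ and $I(a) = \emptyset$, in agreement with item (ii) above.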


Example 3 Let $a(x) = \sqrt{|x|}$, $b$ Lipschitz. Then
$E(a,b)$ admits strong existence and pathwise uniqueness.
In fact, $a$ is Hölder-continuous with parameter
1/2 and the second item of Remark 6 applies; so
pathwise uniqueness holds. Strong existence is a
consequence of Propositions 3 and 1 (ii).

An interesting particular case is provided by the
following equation. Let $x_0, \sigma, \delta \ge 0$, $k \in \mathbb{R}$. The
following equation admits strong existence and
pathwise uniqueness.

$$Z_t = x_0 + \sigma\int_0^t \sqrt{|Z_s|}\,dW_s + \int_0^t (\delta - kZ_s)\,ds, \qquad t \in [0,T] \qquad [18]$$

Equation [18] is widely used in mathematical finance
and it constitutes the model of Cox-Ingersoll-Ross:
the solution of the mentioned equation represents the
short interest rate.

Consider now the particular case where $k = 0$,
$\sigma = 2$. According to some comparison theorem for
SDEs, the solution $Z$ is always non-negative and
therefore the absolute value may be omitted. The
equation becomes

$$Z_t = x_0 + 2\int_0^t \sqrt{Z_s}\,dW_s + \delta t \qquad [19]$$


Definition 8 The unique solution $Z$ to

$$Z_t = x_0 + 2\int_0^t \sqrt{Z_s}\,dW_s + \delta t \qquad [20]$$

is called "square $\delta$-dimensional Bessel process"
starting at $x_0$; it is denoted by $\mathrm{BESQ}^\delta(x_0)$; for fine
properties of this process, see Revuz and Yor (1999,
ch. IX.3).

Since $Z \ge 0$, we call $\delta$-dimensional Bessel process
starting from $x_0$ the process $X = \sqrt{Z}$. It is denoted
by $\mathrm{BES}^\delta(x_0)$.


Remark 8 Let $d \ge 1$. Let $W = (W^1,\ldots,W^d)$ be a
classical d-dimensional Brownian motion. We set
$X_t = \|W_t\|$. $(X_t)_{t\ge 0}$ is a d-dimensional Bessel process.
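
Remark 8 suggests a simple way to sample a Bessel process of integer dimension. The following sketch, given only as an illustration, builds $X_t = \|W_t\|$ from $d$ independent Brownian motions on a grid and checks the identity $E[X_t^2] = d\,t$, which is what one obtains from [20] with $x_0 = 0$ and $\delta = d$ after taking expectations (the stochastic integral having zero mean). The function name `bessel_path` and the sample sizes are assumptions.

```python
import math
import random


def bessel_path(d, T, n_steps, rng):
    """Sample X_t = ||W_t|| on a uniform grid, W a d-dimensional Brownian motion started at 0."""
    dt = T / n_steps
    w = [0.0] * d
    path = [0.0]
    for _ in range(n_steps):
        w = [wi + rng.gauss(0.0, math.sqrt(dt)) for wi in w]   # independent coordinates
        path.append(math.sqrt(sum(wi * wi for wi in w)))
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    d, T = 3, 1.0
    squares = [bessel_path(d, T, 10, rng)[-1] ** 2 for _ in range(3000)]
    # The squared process has mean x_0 + d * T = 3 at time T = 1 (here x_0 = 0).
    print(sum(squares) / len(squares))
```

This construction only covers integer dimensions; for general $\delta$ one would have to discretize [20] directly.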


Remark 9 If $\delta > 1$, it is possible to see that

$$X_t = X_0 + W_t + \frac{\delta - 1}{2}\int_0^t \frac{ds}{X_s}$$


The Case with Distributional Drift 


Pioneering work about diffusions with generalized 
drift was presented by N I Portenko, but in the 
framework of semimartingale processes. Recently, 
some work was done characterizing solutions in the 
class of the so-called Dirichlet processes, with some 
motivations in random irregular environment. 

A useful transformation in the theory of SDEs is
the so-called "Zvonkin transformation." Let $(W_t)$ be
an $(\mathcal{F}_t)$-classical Brownian motion. Let $a$ (resp. $b$) :
$\mathbb{R}\to\mathbb{R}$ (resp. $C^1$) be locally bounded. We suppose
moreover $a > 0$. We fix $x_0 \in \mathbb{R}$. Let $(X_t)_{t\ge 0}$ be a
solution of

$$X_t = x_0 + \int_0^t b(X_s)\,ds + \int_0^t a(X_s)\,dW_s \qquad [21]$$

We set

$$\Sigma(x) = \int_0^x \frac{2b}{a^2}(y)\,dy$$

and we define $h:\mathbb{R}\to\mathbb{R}$ such that

$$h(0) = 0, \qquad h' = e^{-\Sigma}$$

$h$ is strictly increasing. We set $\tilde{a}(x) = (ah')(h^{-1}(x))$,
where $h^{-1}$ is the inverse of $h$. We set $Y_t = h(X_t)$.
Without entering into details, the classical Itô
formula allows one to show that $(Y_t)$ defines a solution of

$$dY_t = \tilde{a}(Y_t)\,dW_t, \qquad Y_0 = h(x_0) \qquad [22]$$

Now, eqn [22] fulfills the requirements of the
Engelbert-Schmidt criterion so that it admits weak
existence and uniqueness in law. Consequently,
barring explosion, one can easily establish the same
well-posedness for [21].
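
The reason the transformed equation [22] has no drift is that $h$ is chosen so that $h'b + \tfrac{1}{2}a^2h'' = 0$, where $h' = e^{-\Sigma}$ and $\Sigma(x) = 2\int_0^x (b/a^2)(y)\,dy$; this is the drift produced by the Itô formula applied to $h(X_t)$. The sketch below checks this cancellation numerically for user-supplied smooth coefficients, using a midpoint quadrature for $\Sigma$ and a central finite difference for $h''$. It is an illustrative verification under these numerical assumptions, and the function name and test coefficients are hypothetical.

```python
import math


def zvonkin_drift_residual(a, b, x, n_quad=2000, dx=1e-4):
    """Return h'(x) b(x) + 0.5 a(x)^2 h''(x) for h' = exp(-Sigma),
    Sigma(z) = 2 * int_0^z b(y) / a(y)^2 dy. The value should be close to 0."""

    def big_sigma(z):
        # Composite midpoint rule on [0, z]; the sign is handled by a signed step.
        step = z / n_quad
        return 2.0 * step * sum(b((i + 0.5) * step) / a((i + 0.5) * step) ** 2
                                for i in range(n_quad))

    def h_prime(z):
        return math.exp(-big_sigma(z))

    h_second = (h_prime(x + dx) - h_prime(x - dx)) / (2.0 * dx)  # central difference for h''
    return h_prime(x) * b(x) + 0.5 * a(x) ** 2 * h_second


if __name__ == "__main__":
    # Hypothetical coefficients: a = 1 and b(y) = -y (an Ornstein-Uhlenbeck-type drift).
    print(zvonkin_drift_residual(lambda y: 1.0, lambda y: -y, 0.7))
    # A non-constant positive diffusion coefficient works the same way.
    print(zvonkin_drift_residual(lambda y: 1.0 + 0.5 * math.sin(y), lambda y: math.cos(y), -0.4))
```

Both printed residuals should be numerically negligible, while the individual terms $h'b$ and $\tfrac{1}{2}a^2h''$ are of order one.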

The Zvonkin transformation also allows one to prove
strong existence and pathwise uniqueness results
for [21]; for instance, when


• $a$ has linear growth, and
• $x \mapsto \int_0^x \frac{b(s)}{a^2(s)}\,ds$


is a bounded function.


In fact, problem [22] satisfies pathwise uniqueness 
and strong existence since the coefficients are 
Lipschitz with linear growth. Therefore, one can 
deduce the same for [21]. 

Veretennikov generalized the Zvonkin transformation
to the d-dimensional case in some cases which
include the case $a = 1$ and $b$ bounded Borel.

Zvonkin's procedure also suggests considering a
formal equation of the type

$$dX_t = a(X_t)\,dW_t + \gamma'(X_t)\,dt \qquad [23]$$

where $\gamma$ is only a continuous function and so $b = \gamma'$
is a Schwartz distribution; $\gamma$ could be, for instance,
the realization of a Brownian motion independent
of $W$. Therefore, eqn [23] is motivated by the study
of irregular random media. When $a = 1$, $b = \gamma'$, SDE
[22], with $h' = e^{-2\gamma}$, still makes sense.

Using the Engelbert-Schmidt criterion, one can see
that problem [22] still admits weak existence and
uniqueness in the sense of distribution laws. If $Y$ is a
solution of [22], $X = h^{-1}(Y)$ provides a natural
candidate solution for [21]. R F Bass and Z-Q Chen, and
F Flandoli, F Russo, and J Wolf, investigated generalized
SDEs such as [23]: in particular, they made the previous
reasoning rigorous, respectively, in the case of strong
and weak solutions, see Flandoli et al. (2003).


Connected Topics 

We aim here at giving some basic references about 
topics which are closely connected to SDEs. 
Stochastic Partial Differential Equations (SPDEs) 


If an SDE is a random perturbation of an ordinary
differential equation, an SPDE is a random
perturbation of a PDE. Several studies were
performed in the parabolic (evolution equation)
and hyperbolic (wave equation) cases. Most of the
work was done in the case of a fixed underlying
probability space. We only quote two basic
monographs which should be consulted first
before getting into the subject: the one of Walsh
(1986) and the one of Da Prato and Zabczyk
(1992).

However, it was possible to establish some results
about weak existence and uniqueness in law for
SPDEs. One possible tool was a generalization of the
Girsanov theorem to the case of Gaussian spacetime
white noise. Weak existence for the stochastic
quantization equation was proved with the help of
infinite-dimensional Dirichlet forms by S Albeverio
and M Röckner.

We also indicate a beautiful recent monograph
by Da Prato (2004) which pays particular attention
to Kolmogorov equations with infinitely many
variables.


Numerical Approximations 


Relevant work was done on the numerical approximation
of solutions to SDEs and the related approximation
of solutions to linear parabolic equations
via the Feynman-Kac probabilistic representation (see
Theorem 2). It seems that stochastic simulations
(of improved Monte Carlo type and related topics)
for solving deterministic problems are efficient when
the space dimension is greater than 4.


Malliavin Calculus 


Malliavin calculus is a wide topic (see Malliavin
Calculus). Relevant applications of it concern
stochastic (ordinary and partial) differential equations.
We only quote a monograph of Nualart
(1995) on those applications. Two main objects
were studied.


• Given a solution $(X_t)$ of an SDE, sufficient
conditions were given so that $X_t$, $t > 0$, has a (smooth)
density $p(t,\cdot)$. Small-time asymptotics of this
density, when $t \to 0$, and small-drift perturbations
were studied, refining Freidlin-Ventsell
large-deviation estimates.

• Coming back to SDE [11], one can
consider coefficients $a, b$ that are nonadapted with respect
to the underlying filtration $(\mathcal{F}_t)$. On the other
hand, the initial condition $\xi$ may be anticipating,
that is, not $\mathcal{F}_0$-measurable. In that case, the Itô
integral $\int_0^t a(s, X_s)\,dW_s$ is not defined. A replacement
tool is the so-called "Skorohod integral."


Rough Paths Approach 


A very successful and significant research field is
rough path theory. In the case of dimension $d = 1$, the
Doss-Sussmann method allows one to transform the
solution of an SDE into the solution of an ordinary
(random) differential equation. In particular, that
solution can be seen as depending (pathwise)
continuously on the driving Brownian motion
$(W_t)$ with respect to the usual topology of $C([0,T])$.
Apart from exceptional cases, this continuity does not hold in
the case of general dimension $d > 1$. Rough path
theory, introduced by T Lyons, makes it possible to recover
this continuity in a suitable sense and establishes a
true pathwise stochastic integration.


SDEs Driven by Non-semimartingales 


At the moment, there is very intense activity
on SDEs driven by processes which are not
semimartingales. In this perspective, we list SDEs
driven by fractional Brownian motion, treated with the help
of rough path theory, using fractional and Young
type integrals, and involving finite cubic variation
processes. Among the contributors in that area we
quote L Coutin, R Coviello, M Errami, M Gubinelli,
Z Qian, F Russo, P Vallois, and M Zähle.


See also: Fractal Dimensions in Dynamics; Image 
Processing: Mathematics; Interacting Stochastic Particle 
Systems; Lagrangian Dispersion (Passive Scalar); 
Malliavin Calculus; Path Integrals in Noncommutative 
Geometry; Quantum Dynamical Semigroups; Quantum 
Fields with Indefinite Metric: Non-Trivial Models; Random 
Dynamical Systems; Random Walks in Random 
Environments; Stochastic Hydrodynamics; Stochastic 
Resonance. 
Introduction


Mathematical models in hydrodynamics are intro- 
duced to describe the motion of fluids. The basic 
equations for Newtonian incompressible fluids are 
the Euler and the Navier-Stokes equations, for 
inviscid and viscous fluids, respectively. For a given 
set of body forces acting on the fluid, these 
nonlinear partial differential equations (PDEs) 
model the evolution in time of the velocity and 
pressure at each point of the fluid, given the initial 
velocity and suitable boundary conditions (see 
Partial Differential Equations: Some Examples). 
The equations of hydrodynamics offer challenging 
mathematical problems, like proving the existence 
and uniqueness of solutions, determining their 
regularity, their asymptotic behavior for large time, 
and their stability. To gain some insight into the 
behavior of fluids, stochastic analysis is introduced 
into hydrodynamics. In fact, there are various
attempts to describe the turbulent regime (see Turbulence
Theories). But, analyzing individual solutions
that determine the flow at any time, for a given 
initial condition, is a desperate task, since the 
dynamics in a turbulent regime is chaotic and highly 
unstable. This is a particular chaotic motion with 
some characteristic statistical properties (see Monin 
and Yaglom (1987)). The aim of a statistical 
description of turbulent flow is to single out some 
relevant collective properties of the flow that,
hopefully, make it possible to grasp the salient 
features of the dynamics. In this sense, stochastic 
hydrodynamics is germane to the kinetic gas theory. 
In the next section we shall review a typical topic of 
stochastic hydrodynamics, the evolution of prob- 
ability measures. Results on stationary probability 
measures will be given in the subsequent sections. 

Another characteristic of turbulent flows is the lack 
of space regularity of the velocity field. We shall 
introduce in the section “The stochastic Navier-Stokes 
equations” a stochastic model of turbulence, 
which exhibits lack of regularity of the solutions. 

The Euler equations are a singular limit of the 
Navier-Stokes equations, since they are first-order 
rather than second-order PDEs. It is therefore no surprise 
that they require different mathematical techniques. A full 
section will be devoted to a discussion of the Euler equations 
and another to the Navier-Stokes equations. Statistics 
of an inviscid flow, when approximated by vortex 
motion, will be described in the final section. 


Statistical Solutions 


Let $u(t, x)$ be the fluid velocity at time $t$ and point 
$x \in D \subset \mathbb{R}^d$; since the initial velocity is always 
affected by experimental errors, it is reasonable to 
assign a measure $\nu$ determining the probability that 
the initial velocity belongs to a Borel set $\Gamma$ of the 
space $H$ of all admissible velocity fields $u = u(x)$. 

A spatial statistical solution is a family of 
probability measures $\mu(t,\cdot)$, $t \ge 0$, each supported 
on the set $H$ such that, given any Borel set $\Gamma$ in $H$, 
we have 

$$\mathrm{Prob}\{u(t,\cdot) \in \Gamma\} = \mu(t,\Gamma), \qquad \forall t \ge 0$$ [1] 

with the initial condition $\mu(0,\Gamma) = \nu(\Gamma)$. The 
construction and analysis of statistical solutions $\mu(t,\cdot)$ is 
one of the crucial mathematical problems in 
stochastic hydrodynamics (see, e.g., Vishik and 
Fursikov (1988)). 

Hopf gave the first mathematical formulation of 
the problem of describing turbulent flows by 
statistical solutions. The first result on the existence 
of statistical solutions is by Foias in 1973. Hopf 
(1952) presented an equation in variational derivatives 
satisfied by the characteristic functional $\chi(t,\varphi)$ 
of the family of measures $\mu(t,\cdot)$ associated with the 
Navier-Stokes equations. The characteristic functional 
$\chi(t,\varphi)$ is the Fourier transform of the measure 
$\mu(t,\cdot)$: 

$$\chi(t,\varphi) = \int_H e^{i\langle u,\varphi\rangle}\,\mu(t,du)$$ [2] 

defined for any smooth test function $\varphi$. 

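We now derive the evolution equation for $\chi(t,\varphi)$, 
by assuming that the dynamics takes place in the 
phase space $H$ and follows the nonlinear equation 

$$\frac{du}{dt} = F(u)$$ [3] 

If $u^v(t)$ is the solution started from $v$ at time $t = 0$, 
then its probability distribution is represented by 
the time-evolved measure $\mu(t,\cdot)$. Therefore, we 
have that 

$$\int_H e^{i\langle u,\varphi\rangle}\,\mu(t,du) = \int_H e^{i\langle u^v(t),\varphi\rangle}\,\mu(0,dv)$$ [4] 

Differentiating in time, we obtain 

$$\frac{\partial\chi}{\partial t}(t,\varphi) = i\int_H e^{i\langle u^v(t),\varphi\rangle}\,\langle\varphi, F(u^v(t))\rangle\,\mu(0,dv) = i\int_H e^{i\langle v,\varphi\rangle}\,\langle\varphi, F(v)\rangle\,\mu(t,dv)$$ [5] 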


The last integral is uniquely determined by $\chi$, since 
the measure $\mu(t,\cdot)$ is uniquely determined by $\chi(t,\varphi)$. 
We denote by $\Phi\chi(t,\varphi)$ the last integral in [5]. The 
evolution equation thus obtained for the characteristic 
functional $\chi$ is 

$$\frac{\partial\chi}{\partial t}(t,\varphi) = i\,\Phi\chi(t,\varphi)$$ [6] 


This is called the Hopf equation associated with the 
dynamical system [3]. 
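
As a consistency check, the Hopf equation can be written out in the simplest possible setting. The following is only a sketch under an assumption not made in the text: the phase space is taken to be $H = \mathbb{R}$ and the dynamics is linear, $F(u) = -au$ with a hypothetical constant $a > 0$, so that $u^v(t) = e^{-at}v$. In this case the right-hand side of [5] closes into a first-order equation for the characteristic function, and its solution agrees with what one obtains directly from [4].

```latex
% Assumed toy case (hypothetical): H = \mathbb{R}, F(u) = -a u with a > 0
\begin{align*}
\chi(t,\varphi) &= \int_{\mathbb{R}} e^{i\varphi v}\,\mu(t,dv), \\
\frac{\partial\chi}{\partial t}(t,\varphi)
  &= i\int_{\mathbb{R}} e^{i\varphi v}\,\varphi\,(-a v)\,\mu(t,dv)
   = -a\,\varphi\,\frac{\partial\chi}{\partial\varphi}(t,\varphi), \\
\chi(t,\varphi) &= \int_{\mathbb{R}} e^{i\varphi e^{-at} v}\,\mu(0,dv)
   = \chi\big(0,\varphi e^{-at}\big).
\end{align*}
```

For a nonlinear $F$ the same computation produces higher-order derivatives in $\varphi$, which is why the general Hopf equation is an equation in variational derivatives.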

Another way to analyze the evolution of measures 
is through the moments; instead of the measure $\mu(t,\cdot)$ 
describing the spatial statistical solution, we deal with 
the moments of $\mu(t,\cdot)$ of any order. For a nonlinear 
dynamics [3], the moment equations are an infinite 
chain of coupled equations, the so-called Friedman-Keller 
equations. 
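
To see why the chain never closes, it may help to look at a scalar caricature. The sketch below assumes, purely for illustration, that $H = \mathbb{R}$ and $F(u) = -u^2$ (a hypothetical quadratic nonlinearity standing in for the hydrodynamic one), and writes $m_k$ for the moment of order $k$; none of these names come from the text.

```latex
% Assumed toy case (hypothetical): H = \mathbb{R}, F(u) = -u^2
\begin{align*}
m_k(t) &:= \int_{\mathbb{R}} u^k\,\mu(t,du)
        = \int_{\mathbb{R}} \big(u^v(t)\big)^k\,\mu(0,dv), \\
\frac{d m_k}{dt}(t) &= \int_{\mathbb{R}} k\,u^{k-1}F(u)\,\mu(t,du)
        = -k\,m_{k+1}(t), \qquad k = 1, 2, \dots
\end{align*}
```

The equation for the moment of order $k$ involves the moment of order $k+1$, so any finite truncation requires a closure assumption.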

A prominent role among statistical solutions is 
played by stationary solutions. They contain all the 
statistical information in the case of equilibrium in 
time. The characteristic functional of 
an invariant measure is constant in time. Therefore, 

$$\frac{\partial\chi}{\partial t}(t,\varphi) = 0$$ 

Bearing in mind equation [5], this is equivalent 
to saying that the signed measure $\langle\varphi, F(v)\rangle\,\mu(t,dv)$ vanishes 
for any test function $\varphi$ and time $t$. Setting $t = 0$, 
we obtain that an invariant measure $\nu$ in the space 
$H$ satisfies the Liouville equation 

$$\int_H \langle\varphi(v), F(v)\rangle\,d\nu(v) = 0$$ [7] 

for appropriate test functions $\varphi$. This equation is 
also called the relation of infinitesimal invariance 
and the measure $\nu$ is said to be infinitesimally 
invariant. 

The stationary measures are natural candidates to 
describe the statistical asymptotic behavior of the 
system when $t \to \infty$. Notice that, in a chaotic system, 
two motions that are arbitrarily close to one another at 
$t = 0$ can evolve in completely different ways. So, to 
describe the dynamics satisfactorily, we take averages 
over a large number of experiments. This is the so-called 
ensemble average. These averages are assumed to be 
with respect to an invariant measure $\mu$. The invariant 
measures must exist and either they are unique or at 
most one has physical meaning and enters in the 
functional integral defining the ensemble average. 
According to the ergodic principle (an assumption not 
yet proved in hydrodynamics), ensemble averages 
replace long-time averages: for every initial velocity 
field $v$, except for a set of initial values negligible in 
some sense, the time average of an observable $\psi$ tends, 
as time goes to infinity, to the ensemble average 

$$\lim_{T\to\infty}\frac{1}{T}\int_0^T \psi(u^v(t))\,dt = \int_H \psi\,d\mu$$ [8] 

However, it is extremely difficult to prove the 
existence of stationary probability measures for the 
Navier-Stokes equations by solving equation 
[7] directly. The situation is formally the same as in 
equilibrium statistical mechanics, where the Liouville 
equation is in fact solved, leading to the Boltzmann- 
Gibbs distribution. However, the results in statistical 
hydrodynamics are far from being satisfactory. 


Recent studies to prove the existence of invariant 
measures for the Navier-Stokes equations are based 
on stochastic models (see the section “The stochastic 
Navier-Stokes equations”). On the other hand, for 
the Euler equations it is possible to construct 
formally invariant measures, by means of invariant 
quantities of the classical motion (see the next 
section). 

Finally, we point out that there are techniques 
using invariant measures to show some results for 
the time evolution (e.g., the motion exists for almost 
all initial values with respect to an invariant 
measure). 


The Euler Equations 


We start by recalling some basic facts about the Euler 
equations (see Incompressible Euler Equations: 
Mathematical Theory). 

The motion of an inviscid, incompressible, and 
homogeneous fluid is described by the Euler 
equations, which in Eulerian coordinates read as 


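$$\begin{aligned} \frac{\partial u}{\partial t} + (u\cdot\nabla)u + \nabla p &= f \\ \nabla\cdot u &= 0 \quad \text{in } D \\ u\cdot n &= 0 \quad \text{on } \partial D \end{aligned}$$ [9] 

where, at time $t \ge 0$ and position $x \in D$, $u = u(t,x)$ 
is the velocity vector and $p = p(t,x)$ the hydrodynamic 
pressure. The units have been chosen so that the 
mass density is $\rho = 1$. $\nabla$ denotes the nabla vector 
operator, so that 

$$\nabla\cdot u = \sum_{j=1}^{d}\frac{\partial u_j}{\partial x_j}, \qquad \nabla p = \Big(\frac{\partial p}{\partial x_1},\dots,\frac{\partial p}{\partial x_d}\Big)$$ 

Finally, $f$ denotes the external force. If the spatial 
domain $D$ has a boundary $\partial D$, then the velocity is 
assumed to be tangent to the boundary ($n$ denotes 
the exterior normal vector to the boundary). Some 
initial condition $u_0$ at time $t = 0$ is assigned. 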

When $f = 0$, there are invariant quantities for 
system [9]. In the literature, there are many works 
suggesting a Gaussian stationary statistics (see, e.g., 
the paper by Kraichnan (1980)). We consider 
invariants that are quadratic in the velocity so as 
to construct (formally) invariant measures of Gibbs 
type: the energy 

$$\mathcal{E}(u) := \frac12\int_D |u|^2\,dx$$ 

and, only in the two-dimensional case ($d = 2$), the 
enstrophy 

$$\mathcal{S}(u) := \frac12\int_D |\mathrm{curl}\,u|^2\,dx$$ 

(with $\mathrm{curl}\,u = \nabla^\perp\cdot u = \partial u_2/\partial x_1 - \partial u_1/\partial x_2$ for $d = 2$). 

It is natural to look for velocity fields in the 
following function spaces: the space $H^0$ of finite 
kinetic energy and the space $H^1$ of finite enstrophy. 
Clearly, the admissible fields should also obey the 
boundary conditions and the divergence-free condition. 
If $P$ is the projection operator onto the space of 
divergence-free vectors, and $B$ is the bilinear form 
$B(u,v) := P[(u\cdot\nabla)v]$, the Euler equations can be 
given the structure of an evolution equation, 

$$\frac{du}{dt} = -B(u,u)$$ [10] 


obtained by applying the projection operator P to 
the first equation in [9]. The pressure disappears and 
can be regarded as a Lagrange multiplier associated 
with the divergence-free constraint ($\nabla\cdot u = 0$); it can 
be fully recovered once the velocity field is known. 
The dynamics is considered in the phase space of 
divergence-free velocity vectors $H$ (a large space 
containing $H^0$ and $H^1$), which is an infinite-dimensional 
functional space. More precisely, identifying 
$H^0$ with its dual $(H^0)'$, we introduce the 
Gelfand triplet 

$$H^1 \subset H^0 \subset (H^1)' = H^{-1}$$ 

The spaces $H^\alpha$, with $\alpha = 1, 2, \dots$, are the usual 
Sobolev spaces but with the additional divergence-free 
and boundary conditions. For $\alpha > 0$ noninteger, 
the spaces $H^\alpha$ are defined by interpolation, whereas 
those with $\alpha < 0$ are defined by duality. As usual, regularity in 
space is related to the spaces $H^\alpha$ with higher 
exponent $\alpha$. We have that $H = \bigcup_{\alpha\in\mathbb{R}} H^\alpha$. 

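Invariance of $\mathcal{E}$ and $\mathcal{S}$ can be proved by resorting to 
eqn [9] and assuming that $u$ is a smooth vector field. 
For instance, 

$$\frac{d}{dt}\mathcal{E}(u(t)) = \int_D u\cdot\frac{\partial u}{\partial t}\,dx = -\int_D u\cdot[(u\cdot\nabla)u]\,dx - \int_D u\cdot\nabla p\,dx$$ 

By integrating by parts and bearing in mind the 
divergence-free condition and the boundary condition, 
we conclude that 

$$\frac{d}{dt}\mathcal{E}(u(t)) = 0$$ 

In the same way, the invariance of $\mathcal{S}$ can be proved. 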
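As a consequence, the following Gibbs measures, 
which are defined on the space $H$, 

$$\mu_{\mathcal{E}}(du) = \frac{1}{Z_{\mathcal{E}}}\,e^{-\mathcal{E}(u)}\,du, \qquad \mu_{\mathcal{S}}(du) = \frac{1}{Z_{\mathcal{S}}}\,e^{-\mathcal{S}(u)}\,du$$ [11] 

are heuristically invariant in time. In [11], $Z_{\mathcal{E}}$ and $Z_{\mathcal{S}}$ are the 
partition functions, that is, they are normalization 
constants needed to guarantee that $\mu_{\mathcal{E}}$ and $\mu_{\mathcal{S}}$ are 
genuine probability measures (e.g., $Z_{\mathcal{E}} = \int_H e^{-\mathcal{E}(u)}\,du$). 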

Actually, these measures $\mu$ solve the Liouville 
equation 

$$\int_H \langle\varphi(u), B(u,u)\rangle\,d\mu(u) = 0$$ [12] 

for any test function $\varphi$ that is cylindrical, infinitely 
differentiable, bounded, and with bounded derivatives. 

On the other hand, (global and not only 
infinitesimal) invariance means that if there exists a 
global flow in time which is well defined in a phase 
space of full measure $\mu$, then the measure $\mu$ is invariant 
under this dynamics. The measures $\mu_{\mathcal{E}}$ and $\mu_{\mathcal{S}}$ are 
centered Gaussian measures whose support is in a 
space larger than $H^0$, as can be proved by standard 
methods in the theory of Gaussian measures on 
infinite-dimensional spaces. By the very definition, $\mu_{\mathcal{E}}$ 
is a cylindrical measure in $H^0$ and $\mu_{\mathcal{S}}$ is cylindrical in 
$H^1$. Then the support of $\mu_{\mathcal{E}}$ is any Hilbert space $\widetilde{H}$ such 
that $H^0 \subset \widetilde{H}$ is a Hilbert-Schmidt embedding, and the 
support of $\mu_{\mathcal{S}}$ is any space $\widetilde{H}$ such that $H^1 \subset \widetilde{H}$ is a 
Hilbert-Schmidt embedding. When the spatial dimension 
$d$ is 2, $\mathrm{supp}(\mu_{\mathcal{E}}) = \bigcap_{\alpha<-1} H^\alpha$ and $\mathrm{supp}(\mu_{\mathcal{S}}) = \bigcap_{\alpha<0} H^\alpha$. 
When $d$ is 3, $\mathrm{supp}(\mu_{\mathcal{E}}) = \bigcap_{\alpha<-3/2} H^\alpha$. 

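Moreover, $\mu_{\mathcal{E}}(H^0) = \mu_{\mathcal{S}}(H^0) = 0$, that is, the space 
of finite energy $H^0$ is negligible with respect to these 
measures. Let us show this property for the 
“enstrophy measure” $\mu_{\mathcal{S}}$ when $d = 2$. Let $\{e_j\}_{j=1}^{\infty}$ be 
a complete orthonormal system in $H^0$. Hence, for 
$u = \sum_j u_j e_j$, we have $\|u\|_{H^0}^2 = \sum_j |u_j|^2$ and $\|u\|_{H^1}^2 = \sum_j \lambda_j |u_j|^2$ 
(with $0 < \lambda_1 \le \lambda_2 \le \cdots$ and $\lambda_j \sim j$ as 
$j \to \infty$). Keeping in mind its definition, the measure 
$\mu_{\mathcal{S}}$ can be considered as a measure on the space of 
the sequences $\{u_j\}_j$ and written as an infinite product 
of one-dimensional centered Gaussian measures 

$$\mu_{\mathcal{S}}(du) = \bigotimes_j \sqrt{\frac{\lambda_j}{2\pi}}\;e^{-(\lambda_j/2)|u_j|^2}\,du_j$$ [13] 

The energy is 

$$\mathcal{E}(u) = \frac12\sum_j |u_j|^2$$ 

and the renormalized energy is 

$$:\!\mathcal{E}\!:(u) = \frac12\sum_j\Big(|u_j|^2 - \int |u_j|^2\,\mu_{\mathcal{S}}(du)\Big)$$ 

Since, as can be easily shown, $\int \big(:\!\mathcal{E}\!:(u)\big)^2\,\mu_{\mathcal{S}}(du) < \infty$, 
$:\!\mathcal{E}\!:(u)$ is finite for $\mu_{\mathcal{S}}$-almost every $u$. On the 
contrary, since $\sum_j \int |u_j|^2\,\mu_{\mathcal{S}}(du) = \sum_j \lambda_j^{-1} = +\infty$, 
$\mathcal{E}(u)$ is infinite for $\mu_{\mathcal{S}}$-almost every $u$. 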
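
The contrast between the two series can be checked numerically. The sketch below takes the asymptotic law $\lambda_j \sim j$ literally and assumes $\lambda_j = j$, which is an illustrative simplification rather than the actual spectrum. Under the product measure [13] each $u_j$ is a centered Gaussian with variance $1/\lambda_j$, so the mean energy of mode $j$ is $1/(2\lambda_j)$ and the variance of its renormalized energy is $1/(2\lambda_j^2)$. The function names are hypothetical.

```python
def mean_energy(n_modes):
    """Partial sum of the mean energies 1/(2*lambda_j), assuming lambda_j = j."""
    return 0.5 * sum(1.0 / j for j in range(1, n_modes + 1))


def renormalized_energy_variance(n_modes):
    """Partial sum of the variances 1/(2*lambda_j**2) of the renormalized
    mode energies (1/2)(|u_j|^2 - 1/lambda_j), assuming lambda_j = j."""
    return 0.5 * sum(1.0 / j**2 for j in range(1, n_modes + 1))


for n in (10, 1_000, 100_000):
    print(n, round(mean_energy(n), 4), round(renormalized_energy_variance(n), 6))
```

The first column of values grows like $\tfrac12 \log n$ without bound, while the second stays below $\pi^2/12$, in line with the statement that the energy is infinite but the renormalized energy is square integrable.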

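We also note in passing that, for any $\gamma > 0$ and 
suitable $\beta$, 

$$\int_H e^{-\beta\,:\mathcal{E}:(u)}\;\frac{e^{-\gamma\mathcal{S}(u)}}{\int_H e^{-\gamma\mathcal{S}(u)}\,du}\,du < \infty$$ 

so that 

$$\mu_{\mathcal{S}}^{(\beta,\gamma)}(du) = \frac{e^{-\beta\,:\mathcal{E}:(u) - \gamma\mathcal{S}(u)}}{\int_H e^{-\beta\,:\mathcal{E}:(u) - \gamma\mathcal{S}(u)}\,du}\,du$$ [14] 

is a probability measure, which is infinitesimally 
invariant for the Euler flow. 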

Since the space of finite-energy velocity fields is negligible 
with respect to these measures, it is necessary to 
replace the classical solutions having finite energy with 
generalized solutions. This is not an easy task in the 
three-dimensional case, whereas some results have 
been proved for the two-dimensional problem, where 
the following existence result holds. Let us analyze 
the quadratic term $B(u,u) = P[(u\cdot\nabla)u]$. The term $(u\cdot\nabla)u$ can 
be rewritten as $\nabla\cdot(u\otimes u)$, taking into account the 
divergence-free condition. Trivially, we have that 
$\nabla\cdot(u\otimes u) = \nabla\cdot(u\otimes u - :u\otimes u:)$, where $:u\otimes u: = \int u\otimes u\,\mu_{\mathcal{S}}(du)$. 
We consider the quadratic expression 
$(u\otimes u - :u\otimes u:)$. This is integrable with respect 
to the measure $\mu_{\mathcal{S}}$ in the sense that 

$$\int \|u\otimes u - :u\otimes u:\|_{H^{-\epsilon}}\,\mu_{\mathcal{S}}(du) < \infty$$ [15] 

for any $\epsilon > 0$. We remark that this property is 
similar to the integrability of the renormalized 
energy, which is a quadratic expression as well. 
This implies that the $H^{-1-\epsilon}$-norm of $\nabla\cdot(u\otimes u)$ is 
integrable with respect to the measure $\mu_{\mathcal{S}}$. Therefore, 
$B(u,u)$ is defined for $\mu_{\mathcal{S}}$-a.e. $u$. 

Now, let us replace eqn [10] with an infinite system of 
equations for all the components $u_j$ with respect to 
the orthonormal basis $\{e_j\}_j$, obtained by taking the 
scalar product with $e_j$ of both sides of eqn [10]: 

$$\frac{du_j}{dt} = -B_j(u,u), \qquad j = 1, 2, \dots$$ [16] 

Each component $B_j(u,u)$ is defined for $\mu_{\mathcal{S}}$-a.e. $u$. 
These estimates lead to the definition of a weak solution (see 
Albeverio and Cruzeiro (1990)): 


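Theorem 1 Let $d = 2$. There exists a flow $U(t,\omega)$ 
defined on a probability space $(\Omega, \mathcal{F}, P)$ with values 
in $H^{-\epsilon}$ for any $\epsilon > 0$, $U(\cdot,\omega) \in C(\mathbb{R}, H^{-1-\epsilon})$ for $P$-a.e. 
$\omega$, such that for each component $U_j$ we have 

$$U_j(t,\omega) = U_j(0,\omega) - \int_0^t B_j(U(s,\omega), U(s,\omega))\,ds, \qquad P\text{-a.e. } \omega,\ \forall t \in \mathbb{R}$$ 

Moreover, the measure $\mu_{\mathcal{S}}$ is invariant under this 
flow. 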


We point out that uniqueness is an open problem 
also for $d = 2$. But already in the classical analysis of 
the Euler equations in a bounded domain, uniqueness 
for initial velocity of finite energy is not known. 
Working with the measure $\mu_{\mathcal{E}}$ is even worse, 
especially when $d = 3$, because its support is a larger 
space within which more irregular velocity vectors 
live. The more irregular the spaces where the flow 
lives, the more difficult it is to handle the nonlinear 
term $B(u,u)$. 

On the other hand, for $d = 1$, the mathematical 
analysis is much easier. For instance, it can be 
proved (see Robert (2003)) that the one-dimensional 
inviscid Burgers equation on the line 

$$\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}\Big(\frac12 u^2\Big) = 0$$ [17] 

has intrinsic invariant statistical solutions, given by a 
class of Lévy processes with negative jumps. 


The Stochastic Navier-Stokes Equations 


The Navier-Stokes equations describe advection 
with velocity u and diffusion with kinematic 
viscosity $\nu > 0$ (see Viscous Incompressible Fluids: 
Mathematical Theory) 

$$\begin{aligned} \frac{\partial u}{\partial t} - \nu\Delta u + (u\cdot\nabla)u + \nabla p &= f \\ \nabla\cdot u &= 0 \quad \text{in } D \\ u &= 0 \quad \text{on } \partial D \end{aligned}$$ [18] 

where $\Delta$ is the Laplace operator. No-slip boundary 
conditions are assumed. Although the Euler equations 
[9] are formally obtained from [18] by setting 
$\nu = 0$, the presence of the second-order operator 
$-\nu\Delta$ makes the analysis needed to prove the 
existence, uniqueness, and regularity of solutions 
easier than for the Euler equations. However, in 
contrast to the Euler equations, the Navier-Stokes 
equations do not possess invariants, since 
the viscosity dissipates energy. Hence, it is difficult 
to find explicit expressions of invariant measures for 
the deterministic Navier-Stokes equations, except 
the trivial invariant measures concentrated on a 
stationary solution. However, as soon as a stochastic 
force is introduced in these equations, it is possible 
to have nontrivial invariant measures. It is impossible 
to review here the wide literature concerning 
the stochastic Navier-Stokes equations and we 
confine ourselves to a few remarks. Most 
results are concerned with proving the existence 
and/or uniqueness of an invariant measure $\mu$, without 
giving an explicit representation, apart from some 
attempts like Gallavotti (2002), where a formal 
representation of stationary distributions is given in 
terms of functional integrals. Some properties of these 
non-explicit invariant measures are known, for 
instance, estimates of moments and exponential 
convergence of the statistical solution for large time. 

Stochastic forces can enter in the Navier-Stokes 
equations in different ways. We can consider 
randomness in the forcing term, so that the force f 
in [18] has a deterministic component, which 
represents its slowly varying mean, and a stochastic 
one, which accounts for small, very rapidly varying 
fluctuations around the mean. Alternatively, since 
the molecules are not rigidly connected to one 
another in the fluid, they are subjected to fluctuations. 
A complete description of fluctuations relating 
the microscopic and macroscopic motion has not 
been achieved at present. However, we shall introduce 
some models for which rigorous mathematical 
results can be proved. 

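The first part of this section concerns the Navier-Stokes 
equations with noise $n$: 

$$\begin{aligned} \frac{\partial u}{\partial t} - \nu\Delta u + (u\cdot\nabla)u + \nabla p &= n \\ \nabla\cdot u &= 0 \end{aligned}$$ [19] 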


for which invariant measures exist, one of which can 
be ergodic provided that the noise is suitably chosen. 
In the second part, a Navier-Stokes-type stochastic 
system is described, which has irregular solutions, as 
expected in turbulence. 

Let us introduce the stochastic Navier-Stokes 
equations with noise white in time. The first equation 
in [19] is an Itô equation: 

$$\partial_t u + [-\nu\Delta u + (u\cdot\nabla)u + \nabla p] = \partial_t w$$ [20] 

Here $w = (w^{(1)},\dots,w^{(d)})$ is a Brownian motion, that 
is, its time derivative $n = \partial w/\partial t$ is a Gaussian 
stochastic field with zero mean and correlation 
function given by 

$$\mathbb{E}\big[n^{(j)}(t,x)\,n^{(k)}(t',x')\big] = \delta_{jk}\,q(x - x')\,\delta(t - t')$$ [21] 

for $j, k = 1,\dots,d$. 

We shall use the differential form for the Itô 
equation [20], always understood in the integral 
form 

$$u(t) - u(0) + \int_0^t \big[-\nu\Delta u(s) + (u(s)\cdot\nabla)u(s) + \nabla p(s)\big]\,ds = w(t)$$ [22] 


Modeling perturbations by a white noise process 
is a first step toward understanding how a random 
perturbation acts in the mathematical equations, 
rather than a good physical or numerical model. The 
first results are in a paper by Bensoussan and 
Temam (1973). 

Obviously, the regularity of the solutions depends 
on the spatial covariance q of the noise. 

Let us consider the following cases. 


• $q = \delta$: the noise is white also in space. 

An invariant measure is known explicitly. Indeed, 
assume periodic boundary conditions on the square 
($d = 2$) or the cube ($d = 3$) $D$, which makes the 
spatial domain a torus. In this case, the Euler and 
Navier-Stokes equations are set in the same functional 
spaces. The generator of the stochastic 
Navier-Stokes equations [20] corresponds to the 
sum of the generator of the Euler equations [9] and 
of the stochastic Stokes equations 

$$\begin{aligned} \partial_t u &= [\nu\Delta u - \nabla p] + \partial_t w \\ \nabla\cdot u &= 0 \end{aligned}$$ [23] 

Since the first equation in [23] is linear in the 
unknown velocity $u$, the Stokes system has a unique 
invariant measure, which is a centered Gaussian 
measure. In particular, when the noise is a space-time 
white noise and $d = 2$, this is the invariant 
measure [14] of the enstrophy: 

$$\mu_{\mathcal{S}}^{(0,2\nu)}(du) = \frac{1}{Z}\,e^{-2\nu\mathcal{S}(u)}\,du$$ 

On a two-dimensional torus, it is proved that this 
measure is not only infinitesimally invariant, but 
also globally invariant for a unique flow [20] 
defined for $\mu_{\mathcal{S}}^{(0,2\nu)}$-a.e. initial velocity. We recall 
that initial velocities of finite energy are negligible 
with respect to the measure $\mu_{\mathcal{S}}^{(0,2\nu)}$. 

• $q$ more regular than above, that is, the noise is 
colored in space. 


As soon as the forcing term is more regular in space, 
the Navier-Stokes system has a solution of finite 
energy. These are solutions close to those of the 
deterministic equation. Techniques similar to those 
used to prove the existence and/or uniqueness of 
solutions for the deterministic equations work also 
in the stochastic case with an additive noise (or even 
a multiplicative noise) to get weak or strong 
solutions. Global existence in the space $H^0$ is proved 
for $d = 2, 3$ and uniqueness only for $d = 2$, as is the 
case for the deterministic Navier-Stokes equations. 
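
The Gaussian invariant measure of the linear Stokes system can be made concrete mode by mode. The sketch below assumes that a single Fourier mode of the Stokes equations with space-time white noise obeys the scalar Itô equation $du_j = -\nu\lambda_j u_j\,dt + dw_j$ (a standard reduction, but an assumption here), and tracks the variance of its Euler-Maruyama discretization. The function name, the parameter values and the step size are illustrative. The limiting variance should approach $1/(2\nu\lambda_j)$, which is the variance of the one-dimensional Gaussian with density proportional to $e^{-\nu\lambda_j u_j^2}$.

```python
def euler_maruyama_variance(nu, lam, h, n_steps, v0=0.0):
    """Variance of the Euler-Maruyama iterates
        u_{n+1} = (1 - nu*lam*h) * u_n + sqrt(h) * xi_n,   xi_n ~ N(0, 1),
    for the assumed mode equation du = -nu*lam*u dt + dw.
    Since u_n and xi_n are independent, Var(u_{n+1}) = a^2 Var(u_n) + h."""
    a = 1.0 - nu * lam * h
    v = v0
    for _ in range(n_steps):
        v = a * a * v + h
    return v


nu, lam = 0.5, 3.0  # illustrative viscosity and eigenvalue
v_inf = euler_maruyama_variance(nu, lam, h=1e-4, n_steps=200_000)
print(f"discrete stationary variance = {v_inf:.6f}")
print(f"1 / (2 * nu * lam)           = {1.0 / (2.0 * nu * lam):.6f}")
```

Working with the variance recursion instead of sampled paths keeps the check deterministic; a Monte Carlo simulation of the same scheme would fluctuate around the same value.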

The interesting feature is that by adding a noise 
which acts on all the components with respect to a 
Hilbert basis (or at least on many components), the 
stochastic Navier-Stokes system has a unique 
invariant measure, which is ergodic. This is proved 
for the spatial dimension $d = 2$. By means of the 
Krylov-Bogoliubov method, existence of at least 
one invariant measure is proved by compactness of a 
family of averaged measures; the limit measures are 
stationary measures. But, when many modes are 
perturbed by a noise, there is a mixing effect on the 
dynamics, which prevents the existence of many stationary 
measures. For the spatial dimension $d = 2$, the best 
result in this context is in Hairer and Mattingly 
(2004), where the noise acts on very few modes. For 
the spatial dimension $d = 3$, the result in Da Prato 
and Debussche (2003) shows the existence of an 
invariant measure; even if there is no uniqueness of 
the solutions (as in the deterministic case), by a 
selection principle, they construct a transition 
semigroup, which has a unique invariant measure, 
ergodic and strongly mixing. 

Mathematical proofs are given for very different 
noises. (The reader is urged to consult, among 
others, the papers by E, Mattingly and Sinai; Flandoli 
and Maslowski; Mikulevicius and Rozovskii; Vishik 
and Fursikov. The latter authors also study statistical 
solutions in two and three dimensions. For a kick noise 
$n = \sum_k \delta(t - k)\,q_k(x)$ in equations [19], there are results 
for $d = 2$ by Bricmont, Kupiainen and Lefevere; Kuksin 
and Shirikyan.) 

We conclude that, as far as invariant measures 
and their ergodicity are concerned, the stochastic 
Navier-Stokes equations have richer results than the 
deterministic Navier-Stokes equations. It is appeal- 
ing to investigate the limit as the intensity of the 
noise goes to zero, so as to recover the deterministic 
equation. Now, think of equation [19] with a noise 
$\epsilon n$, for $n$ fixed and $\epsilon > 0$. Due to the sensitive 
dependence on initial conditions, even a small noise 
may have important effects on the dynamics. A 
conjecture by Kolmogorov is that the unique 
invariant measure $\mu_\epsilon$ tends, when $\epsilon \to 0$, to a specific 
measure, the so-called Kolmogorov measure, which 
would enter into the ergodic principle. This is a 
difficult problem, not yet solved. 

We also mention the analysis of the inviscid limit. 
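Kuksin (2004) showed that the solution $u_\nu$ of the 
two-dimensional stochastic Navier-Stokes equations 

$$\begin{aligned} \frac{\partial u}{\partial t} - \nu\Delta u + (u\cdot\nabla)u + \nabla p &= \sqrt{\nu}\,n \\ \nabla\cdot u &= 0 \end{aligned} \qquad 0 < \nu \le 1$$ [24] 

on the torus converges in distribution to a stationary 
solution of the Euler equations. Here $n$ is a random 
force white in time and smooth in space. More 
precisely, along a suitable subsequence $\nu_j \to 0$, 

$$\lim_{\nu_j\to 0}\ \lim_{T\to\infty} u_{\nu_j}(T + t) = U(t)$$ [25] 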


and almost every trajectory of the nontrivial limit 
process U solves the Euler equations [9] without the 
forcing term. Moreover, the process U keeps 
memory of some features of the noise force n, since 
the mean values of the enstrophy and of the energy 
of U depend on the noise n. 

We now present the second part on stochastic 
models for viscous fluids. In his 1884 paper, 
Reynolds introduced the decomposition of turbulent 
flow into mean and fluctuating flows. The equations 
obtained are difficult to study. We now present a 
tractable model for a one-dimensional problem 
($d = 1$) with a suitable model of fluctuations. 
Decompose the velocity field into the sum of a 
mean flow $\bar{u}$ and a fluctuation $\tilde{u}$ 

$$u = \bar{u} + \tilde{u}$$ 

The fluctuation is assumed to be highly irregular; it 
is reasonable to model it by a stochastic process. If 
we choose 

$$\tilde{u} = b\,\frac{dw}{dt}$$ 

where $b$ is a given velocity field and $dw/dt$ is white 
noise, then the motion of the fluid is governed by a 
stochastic equation of Itô type. Indeed, the Navier-Stokes 
equations are balance equations of linear 
momentum: 

$$\frac{Du}{Dt} = \nu\Delta u - \nabla p$$ [26] 

where $Du/Dt$ is the material time derivative along 
the trajectory of a particle which is at time $t$ in 
position $x(t)$, moving with velocity $u$ (so 
$u(t,x(t)) = (dx/dt)(t)$): 

$$\frac{D}{Dt}u = \frac{d}{dt}u(t,x(t)) = \frac{\partial u}{\partial t} + (u\cdot\nabla)u$$ [27] 

According to the mathematical model for the 
fluctuation, we have 

$$dx(t) = \bar{u}(t,x(t))\,dt + b(x(t))\,dw(t)$$ [28] 

Therefore, $D\bar{u}$ is computed by means of Itô's 
formula 

$$D\bar{u}(t,x(t)) = \frac{\partial\bar{u}}{\partial t}\,dt + \sum_j \frac{\partial\bar{u}}{\partial x_j}\,dx_j(t) + \frac12\sum_{j,k}\frac{\partial^2\bar{u}}{\partial x_j\,\partial x_k}\,b_j b_k\,dt$$ [29] 

This leads to the stochastic Navier-Stokes-type 
equations (we drop the overline symbol) 

$$\begin{aligned} du + \big[-\nu\Delta u + (u\cdot\nabla)u + \nabla p + \tfrac12 Qu\big]\,dt &= -(b\cdot\nabla)u\,dw(t) \\ \nabla\cdot u &= 0 \end{aligned}$$ [30] 

where $Q$ is the second-order differential operator 
given by the last term in [29]. 

Rigorous mathematical results for the above 
equations have been proved for the one-dimensional 
case, that is, the Burgers equations on the line. 
Given an initial velocity of finite energy $u_0 \in H^0$, 
there exists a unique solution $u \in C([0,T];H^0) \cap 
L^2(0,T;H^1)$ ($P$-a.s.). But it can be shown that for a 
more regular initial velocity there is no higher 
regularity of the solution of eqn [30], if $b \ne 0$. This 
means that these stochastic Burgers equations 
cannot have very regular nontrivial solutions, as 
expected in turbulent motion. 


Statistics of Vortices and Bidimensional 
Turbulence 


Onsager (1949) proposed to investigate bidimen- 
sional turbulent flows, extending in a rigorous way 
to hydrodynamics the statistical mechanics approach 
of Boltzmann. If we are interested in flows of finite 
energy, the results of the section “The Euler 
equations” provide no answer to the problem. 
Another way to proceed is by approximating the 
Euler equations in a suitable way. Actually, in a 
two-dimensional turbulent flow, there appears a 
large-scale organization leading to coherent struc- 
tures. These are hydrodynamical vortices, whose 
dynamics is governed by the Euler equations. 
Onsager suggested to approximate the continuous 
Euler equations by a great (but finite) number of 
point vortices. This leads to a finite-dimensional 
Hamiltonian system, to which the methods of 
statistical mechanics can be successfully applied. Of 
course, the crucial point is to pass to the limit, to 
recover the continuous system. But there are many 
different ways to approximate a continuous vorticity 
by a cloud of point vortices and different approx- 
imations may lead to very different statistical 
equilibrium states. 

We follow here the approach of Lions 
(1997). To get an idea of a completely different 
approximation, see, for example, Robert (2003). 

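Let $D$ be a bounded open smooth simply 
connected subset of $\mathbb{R}^2$. Then there exists a function 
$\psi$ (the stream function) such that $u = \nabla^\perp\psi$ and 
$\psi|_{\partial D} = 0$. Given the velocity $u$, we recover the stream 
function $\psi$ by means of the vorticity $\omega = \mathrm{curl}\,u = -\Delta\psi$, 
so $\psi(x) = \int_D g(x,y)\,\omega(y)\,dy$ (here $g$ is the Green's 
function of the Laplacian $-\Delta$ and $x, y$ are points in 
$D$). The Euler equations can be written as 

$$\begin{aligned} \frac{\partial\omega}{\partial t} + u\cdot\nabla\omega &= 0 \\ \omega &= \mathrm{curl}\,u \end{aligned}$$ [31] 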


Consider now a solution given by vorticity concentrated 
in a finite number $N$ of points: 

$$\omega = \sum_{i=1}^{N} \lambda_i\,\delta_{x_i(t)}$$ [32] 

Here the vortex intensities $\lambda_i$ are real values and 
$x_i(t)$ are distinct points in $D$ for $i = 1,\dots,N$. 
According to the Euler equations, these points evolve 
as follows (see also Marchioro and Pulvirenti (1994)): 


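$$\frac{dx_j}{dt} = \nabla^\perp_{x_j}\Big[\sum_{i\ne j}\lambda_i\,g(x_j,x_i) + \lambda_j\,\tilde{g}(x_j,x_j)\Big], \qquad j = 1,\dots,N$$ [33] 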


where $\tilde{g}$ is related to the Green's function $g$. This is a 
Hamiltonian system in $D^N$. Hereafter, we shall 
suppose that the vortex intensities are the same 
($\lambda_i = \lambda\ \ \forall i$), so that the Hamiltonian is 

$$H(x_1,\dots,x_N) = \frac12\sum_{i\ne j} g(x_i,x_j) + \sum_{j=1}^{N} \tilde{g}(x_j,x_j)$$ [34] 

By means of $H$, we define the canonical Gibbs 
measure 

$$\mu^N(dx_1\,dx_2\cdots dx_N) = \frac{1}{Z(N)}\,e^{-\beta\lambda^2 H(x_1,\dots,x_N)}\,dx_1\,dx_2\cdots dx_N$$ [35] 

where $Z(N)$ is the partition function. If $Z(N) < \infty$, 
then $\mu^N$ is a well-defined probability measure on $D^N$ 
and, by construction, it is an invariant measure for 
system [33]. We can prove that $Z(N)$ is finite for 
$\beta\lambda^2 \in (-8\pi/N, 4\pi)$, so that it is natural to choose as 
a scaling $\beta\lambda^2 N = \tilde\beta$. Hence, 

$$\mu^N(dx_1\,dx_2\cdots dx_N) = \frac{1}{Z(N)}\,e^{-(\tilde\beta/N)H(x_1,\dots,x_N)}\,dx_1\,dx_2\cdots dx_N$$ [36] 

is considered for $-8\pi < \tilde\beta \le 0$, or $\tilde\beta > 0$ with 
$N > \tilde\beta/(4\pi)$. 

Bearing in mind the Onsager approach of approximating 
the turbulent Euler motion by means of point 
vortices, we are interested in the limit as $N$ goes to 
$+\infty$, for $\tilde\beta$ fixed in $(-8\pi, +\infty)$. It turns out that, 
when the number of point vortices becomes very 
large, their statistical behavior corresponds to a very 
large number of independent particles moving in a 
mean force field that they create. 

More precisely, consider $\lambda = 1/N$, $\beta = \tilde\beta N$. The 
empirical measure 

$$\frac{1}{N}\sum_{i=1}^{N} \delta_{x_i(t)}$$ 

describing the vorticity, weakly converges to a 
probability density $\rho$ and each correlation function 

$$\rho_j^N(x_1,\dots,x_j) = \int_{D^{N-j}} \frac{1}{Z(N)}\,e^{-(\tilde\beta/N)H(x_1,\dots,x_N)}\,dx_{j+1}\cdots dx_N, \qquad \text{for } j = 1,\dots,N$$ [37] 

weakly converges to $\prod_{i=1}^{j}\rho(x_i)$. 
The equation satisfied by $\rho$, also called the mean-field 
equation, is 

$$\rho(x) = \frac{e^{-\tilde\beta U(x)}}{\int_D e^{-\tilde\beta U(y)}\,dy} \qquad \text{with } U(x) = \int_D g(x,y)\,\rho(y)\,dy$$ [38] 


The relation between $U$ and $\rho$ can also be written as 
$-\Delta U = \rho$ in $D$, $U = 0$ on $\partial D$. We point out that 
$u = \nabla^\perp U$ is a stationary solution of the Euler 
equations. Indeed, $\omega = -\Delta U = \rho$ and $\rho$ is a function 
of $U$, let us say $\rho = F(U)$. This gives that 
$\nabla\omega = \nabla U\,F'(U)$ and thus the term $u\cdot\nabla\omega$ in the 
Euler equation [31] vanishes. 
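
The last step rests on the orthogonality of a gradient and its rotation by a right angle. Written out in components, as a short check that uses only the relations stated above and holds for either sign convention of $\nabla^\perp$:

```latex
\begin{align*}
u\cdot\nabla\omega
  &= \nabla^\perp U\cdot\nabla\big(F(U)\big)
   = F'(U)\,\nabla^\perp U\cdot\nabla U \\
  &= \pm F'(U)\Big(\frac{\partial U}{\partial x_2}\frac{\partial U}{\partial x_1}
       - \frac{\partial U}{\partial x_1}\frac{\partial U}{\partial x_2}\Big)
   = 0 .
\end{align*}
```

Since $\omega$ does not depend on time, $\partial\omega/\partial t = 0$ as well, so both terms in the first equation of [31] vanish.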

It can be proved that there exists a solution of the 
mean-field equation when $\tilde\beta > 0$ or when $\tilde\beta < 0$ and 
$D$ is simply connected. Uniqueness is known in some 
cases, for instance, when D is a bounded open 
smooth simply connected domain and the velocity is 
assumed tangent to the boundary. 

There is numerical evidence supporting this approximation 
approach (see references in Lions (1997) 
referring to the periodic case). It shows that for 
large time and large Reynolds number (viscosity $\nu$ 
close to 0), the vorticity of the solution of the 
Navier-Stokes equations appears in a simple and 
organized structure. This stays intact until the 
viscous dissipation damps it. The important 
observation is that the organized structure is described 
quite precisely by the solution of the mean-field 
equation for some specific $\tilde\beta$. 

Actually, to say that a fluid is inviscid is an 
approximation (which may be justified in many 
contexts), since every fluid displays some kind of 
viscosity. But turbulence is a phenomenon occurring 
at very small viscosity. In this sense, the above result 
provides a description of stationary regime in an 
ideal fluid, which is a good approximation of some 
numerical simulations of real fluids. Apart from this 
good agreement with numerical simulations, there is 
no proof on how to deduce the mean-field equation 
from the Euler equations (e.g., which parameter $\tilde\beta$ 
has to be chosen in eqn [38]?). 


Remark The extension of this analysis to three- 
dimensional flows involves vortex filaments, instead 
of point vortices. There are attempts to describe 
interacting vortex filaments as proposed by Chorin. 
Idealizations of behavior of vortices are introduced 
to have a tractable mathematical model. The reader 
is referred to Lions (1997) for a description of nearly 
parallel vortex filaments and to Flandoli and Bessaih 
(2003) for more realistic filaments which fold. 


See also: Cauchy Problem for Burgers-Type Equations; 
Hamiltonian Fluid Dynamics; Incompressible Euler 
Equations: Mathematical Theory; Malliavin Calculus; 
Non-Newtonian Fluids; Partial Differential Equations: 
Some Examples; Stochastic Differential Equations; 
Turbulence Theories; Viscous Incompressible Fluids: 
Mathematical Theory; Vortex Dynamics. 
Introduction 


The stochastic Loewner evolution or Schramm- 
Loewner evolution (SLE) is a family of random curves 
that appear as scaling limits of curves or cluster 
boundaries of discrete statistical mechanical models in 
two dimensions at criticality. The stochastic Loewner 
evolution was introduced by Oded Schramm as a 
candidate for the limit of loop-erased random walk 
and the boundary of percolation clusters, and it is now 
believed that SLE curves appear in most planar critical 
systems whose scaling limit satisfies conformal invar- 
iance. The curves are defined by solving a Loewner 
differential equation with a random input. 


Definition 


There are three major one-parameter families of SLE 
curves — chordal, radial, and whole-plane — which 
correspond to curves connecting two boundary points 
in a domain, a boundary point and an interior point in 
a domain, and two points in C, respectively. The 
parameter is usually denoted $\kappa > 0$. The starting point
for defining SLE is to write down the assumptions 
that one expects from a scaling limit, assuming that 
the limit is conformally invariant. 

In the chordal case, we assume that there is a family of probability
measures $\{\mu_D(z,w)\}$, indexed by simply connected proper domains
$D \subset \mathbb{C}$ and distinct boundary points $z, w \in \partial D$,
supported on continuous curves $\gamma : [0, t_\gamma] \to \overline{D}$
with $\gamma(0) = z$, $\gamma(t_\gamma) = w$, which satisfies the following:


• Conformal invariance. If $f : D \to D'$ is a conformal transformation,
then the image of $\mu_D(z,w)$ under $f$ is the same as
$\mu_{D'}(f(z), f(w))$, up to a time change.

• Conformal Markov property for $\mu_D(z,w)$. Suppose $\gamma[0,t]$ is
known, and let $g_t$ be a conformal transformation of the slit domain
$D \setminus \gamma[0,t]$ onto $D$ with $g_t(\gamma(t)) = z$, $g_t(w) = w$
(see Figure 1). Then the conditional distribution of
$g_t \circ \gamma[t, t_\gamma]$, given $\gamma[0,t]$, is the same, up to a
change of parametrization, as the original distribution. (Implicit in
this is the assumption that $\gamma(t)$ is on the boundary of
$D \setminus \gamma(0,t]$, which will be true, e.g., if $\gamma$ is
non-self-intersecting and $\gamma(0, t_\gamma) \subset D$.)


Using the Riemann mapping theorem, one can see that such a family
$\{\mu_D(z,w)\}$ is determined (up to reparametrization) by
$\mu_{\mathbb{H}}(0,\infty)$, where $\mathbb{H} = \{x + iy : y > 0\}$
denotes the upper half-plane. Suppose $\gamma : [0,\infty) \to \mathbb{C}$
is a simple (i.e., no self-intersections) curve with $\gamma(0) = 0$,
$\gamma(0,\infty) \subset \mathbb{H}$, and
$\sup_t \operatorname{Im}[\gamma(t)] = \infty$. Let
$H_t = \mathbb{H} \setminus \gamma[0,t]$. There is a unique conformal
transformation $g_t : H_t \to \mathbb{H}$ whose expansion at infinity is

$$g_t(z) = z + \frac{b(t)}{z} + O(|z|^{-2}), \qquad z \to \infty$$

(see Figure 2). The coefficient $b(t)$, which is sometimes called the
half-plane capacity of $\gamma[0,t]$ and denoted
$\operatorname{hcap}[\gamma[0,t]]$, is continuous, strictly increasing,
and tends to $\infty$. In fact,

$$b(t) = \lim_{y \to \infty} y\, \mathbb{E}\big[\operatorname{Im}[X_\tau] \mid X_0 = iy\big]$$

where $X_s$ denotes a complex Brownian motion and
$\tau = \tau_{\gamma[0,t]}$ is the first time $s$ such that
$X_s \in \mathbb{R} \cup \gamma[0,t]$. By reparametrizing $\gamma$, we may
assume that $b(t) = 2t$. With this parametrization, the maps $g_t$ satisfy
the Loewner differential equation

$$\dot g_t(z) = \frac{2}{g_t(z) - U_t}$$

where $U : [0,\infty) \to \mathbb{R}$ is a continuous function with
$U_0 = 0$. In fact, $U_t = g_t(\gamma(t))$. Schramm observed that the
measure $\mu_{\mathbb{H}}(0,\infty)$, at least if it were supported on
simple curves and the curves were parametrized using half-plane capacity,
would produce a random $U_t$. If the assumptions above on
$\{\mu_D(z,w)\}$ are translated into assumptions on the “driving function”
$U_t$, one shows readily that $U_t$ must be a driftless Brownian motion,
that is, $U_t = \sqrt{\kappa}\, B_t$ for a standard one-dimensional
Brownian motion $B_t$. Chordal SLE$_\kappa$ (in $\mathbb{H}$ connecting 0
and $\infty$) is defined to be the random collection of conformal maps
$g_t$ obtained by solving the initial-value problem

$$\dot g_t(z) = \frac{2}{g_t(z) - \sqrt{\kappa}\, B_t}, \qquad g_0(z) = z \qquad [1]$$

where $B_t$ is a standard one-dimensional Brownian motion. Equation [1]
is often given in terms of the inverse $f_t = g_t^{-1}$:

$$\dot f_t(z) = -f_t'(z)\, \frac{2}{z - \sqrt{\kappa}\, B_t}$$

Figure 1 The map $g_t$ from $D \setminus \gamma[0,t]$ onto $D$.

Figure 2 The map $g_t$ from $\mathbb{H} \setminus \gamma[0,t]$ onto $\mathbb{H}$.


This equation describes a random evolution of conformal maps $f_t$ from
$\mathbb{H}$ into subdomains of $\mathbb{H}$. For each
$z \in \overline{\mathbb{H}}$, the solution of [1] is defined up to a time
$T_z \in [0,\infty]$ with $T_z > 0$ for $z \neq 0$. For fixed $t$, $g_t$ is
the unique conformal transformation of
$H_t := \{z \in \mathbb{H} : T_z > t\}$ onto $\mathbb{H}$ with expansion

$$g_t(z) = z + \frac{2t}{z} + \cdots, \qquad z \to \infty$$


The chordal SLE$_\kappa$ path is the random curve
$\gamma : [0,\infty) \to \overline{\mathbb{H}}$ such that for each $t$,
$H_t$ is the unbounded component of $\mathbb{H} \setminus \gamma[0,t]$. It
is not immediate from the definition that such a curve $\gamma$ exists,
but its existence has been proved. If $G_t = g_{t/\kappa}$, then we can
write eqn [1] as

$$\dot G_t(z) = \frac{a}{G_t(z) + W_t}, \qquad G_0(z) = z \qquad [2]$$

where $a = 2/\kappa$ and $W_t := -\sqrt{\kappa}\, B_{t/\kappa}$ is a
standard Brownian motion. Then $Z_t := G_t(z) + W_t$ satisfies the Bessel
stochastic differential equation

$$dZ_t = \frac{a}{Z_t}\, dt + dW_t, \qquad Z_0 = z \qquad [3]$$

This equation is valid up to time $\kappa T_z$, which is the first time
that $Z_t = 0$.
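
As an illustration, the Bessel equation [3] can be explored numerically for a real starting point. The following is a minimal Euler-Maruyama sketch, not a method taken from the article: the function names, step size, time horizon, and the rule "the discretized path is at or below 0" as a proxy for hitting zero are all assumptions made for this example.

```python
import math
import random


def bessel_hitting_time(a, z0=1.0, t_max=4.0, dt=2e-3, rng=None):
    """Euler-Maruyama discretization of dZ = (a / Z) dt + dW with Z_0 = z0 > 0.

    Returns the first grid time at which the discretized path is <= 0,
    or None if that does not happen before t_max (illustrative proxy only).
    """
    rng = rng or random.Random()
    z, t = z0, 0.0
    sd = math.sqrt(dt)
    while t < t_max:
        z += (a / z) * dt + sd * rng.gauss(0.0, 1.0)
        t += dt
        if z <= 0.0:
            return t
    return None


def hitting_fraction(a, n_paths=200, seed=0, **kwargs):
    """Fraction of simulated paths that reach 0 before t_max."""
    rng = random.Random(seed)
    hits = sum(
        bessel_hitting_time(a, rng=rng, **kwargs) is not None
        for _ in range(n_paths)
    )
    return hits / n_paths


if __name__ == "__main__":
    for a in (2.0, 0.1):  # a = 2 / kappa, i.e. kappa = 1 and kappa = 20
        print(a, hitting_fraction(a))
```

With $a = 2/\kappa$, a large value of $a$ gives a strong repulsion from the origin and a small value a weak one. A fixed-step scheme is crude near $Z_t = 0$, so the output is only a qualitative guide.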

Although chordal SLE$_\kappa$ is defined with a particular
parametrization, one generally thinks of it as a measure on curves modulo
reparametrization. The scaling properties of Brownian motion imply that
this measure is invariant under dilations of $\mathbb{H}$. If $D$ is a
simply connected domain and $z, w$ are distinct boundary points of $D$,
chordal SLE$_\kappa$ in $D$ connecting $z$ and $w$ is defined to be the
conformal image of SLE$_\kappa$ in $\mathbb{H}$ from 0 to $\infty$ under a
conformal transformation of $\mathbb{H}$ onto $D$ taking 0 to $z$ and
$\infty$ to $w$. There is a one-parameter family of such transformations,
but the scale invariance of SLE$_\kappa$ in $\mathbb{H}$ shows that the
image measure is independent of the choice of transformation.

The geometric and fractal properties of the curve $\gamma$ vary greatly
as the parameter $\kappa$ changes:

• if $\kappa \le 4$, $\gamma$ is a simple curve;

• if $4 < \kappa < 8$, $\gamma$ has self-intersections, but is not space
filling; and

• if $\kappa \ge 8$, $\gamma$ is a space-filling curve.


To see this, one notes that the conformal Markov property implies that
there can be double points with positive probability if and only if
$T_x < \infty$ occurs with positive probability for $x > 0$. In addition,
the curve is space filling if and only if $T_z < \infty$ for all $z$ and
$T_w \neq T_z$ for $w \neq z$. The problem is then reduced to a problem
about the Bessel equation [3], for which the following holds:

• if $a \ge 1/2$ and $z \neq 0$, the probability that $T_z < \infty$ is
zero. If $a < 1/2$, this probability equals 1.

• if $1/4 < a < 1/2$, and $w, z$ are distinct points in $\mathbb{H}$,
then there is a positive probability that $T_w = T_z$.

• if $0 < a \le 1/4$, then with probability 1, $T_w \neq T_z$ for all
$w \neq z$.


This kind of argument is typical when studying SLE: geometric properties
of the curve are established by analyzing a stochastic differential
equation. The Hausdorff dimension of the path $\gamma$ is given by

$$\dim[\gamma[0,\infty)] = \min\Big\{1 + \frac{\kappa}{8},\ 2\Big\}$$
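
To get a feel for the curves described above, one can approximate a chordal SLE$_\kappa$ trace numerically. The sketch below is one common discretization and not a construction given in this article: it assumes the driving function is frozen at its grid value on each time step, so that each one-step map removes a short vertical slit, and it recovers the tip at time $t_n$ by composing the inverse one-step maps. The names `sle_trace` and `_sqrt_upper` are hypothetical.

```python
import cmath
import math
import random


def _sqrt_upper(w):
    """Complex square root chosen with non-negative imaginary part."""
    s = cmath.sqrt(w)
    return -s if s.imag < 0 else s


def sle_trace(kappa, n_steps=200, t_max=1.0, seed=0):
    """Approximate trace points gamma(k * dt), k = 0..n_steps, in the upper half-plane.

    Assumption: the driving function U_t = sqrt(kappa) * B_t is replaced by a
    piecewise-constant function equal to u[k] on the k-th step. For constant
    driving value c, the one-step Loewner map is g(z) = c + sqrt((z - c)^2 + 4 dt),
    with inverse f(w) = c + sqrt((w - c)^2 - 4 dt).
    """
    rng = random.Random(seed)
    dt = t_max / n_steps
    u = [0.0]
    for _ in range(n_steps):
        u.append(u[-1] + math.sqrt(kappa * dt) * rng.gauss(0.0, 1.0))

    trace = [0j]
    for n in range(1, n_steps + 1):
        w = complex(u[n], 0.0)
        # Pull the driving point back through the inverse maps, latest step first.
        for k in range(n, 0, -1):
            d = w - u[k]
            w = u[k] + _sqrt_upper(d * d - 4.0 * dt)
        trace.append(w)
    return trace


if __name__ == "__main__":
    for kappa in (2.0, 6.0):
        pts = sle_trace(kappa, n_steps=100, seed=1)
        print(kappa, pts[-1])
```

The cost is quadratic in the number of steps, which is acceptable for a few hundred points. For $\kappa = 0$ the driving function vanishes and the scheme reproduces the vertical segment $\gamma(t) = 2i\sqrt{t}$, which gives a simple sanity check.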


The radial Loewner equation describes the evolution of a curve from the
boundary of the unit disk $\mathbb{D} = \{z : |z| < 1\}$ to the origin.
Suppose $\gamma : [0,\infty) \to \overline{\mathbb{D}}$ is a simple curve
with $\gamma(0) = 1$, $\gamma(0,\infty) \subset \mathbb{D} \setminus \{0\}$,
and $\gamma(t) \to 0$ as $t \to \infty$. Let $g_t$ be the unique conformal
transformation of $\mathbb{D} \setminus \gamma[0,t]$ onto $\mathbb{D}$ such
that $g_t(0) = 0$, $g_t'(0) > 0$. One can check that $g_t'(0)$ is
continuous and strictly increasing in $t$, and hence we can parametrize
$\gamma$ in such a way that $g_t'(0) = e^t$. Using this reparametrization,
there is a continuous $U : [0,\infty) \to \mathbb{R}$ with $U_0 = 0$ such
that $g_t$ satisfies the radial Loewner equation

$$\dot g_t(z) = g_t(z)\, \frac{e^{iU_t} + g_t(z)}{e^{iU_t} - g_t(z)}$$

If $z \neq 0$, then we can define $h_t(z) = -i \log g_t(z)$ locally near
$z$, and this equation becomes

$$\dot h_t(z) = \cot\Big(\frac{h_t(z) - U_t}{2}\Big)$$


Radial SLE$_\kappa$ (connecting 1 and 0 in $\mathbb{D}$) is obtained by
setting $U_t = \sqrt{\kappa}\, B_t$. If $D$ is a simply connected domain,
$z \in D$, $w \in \partial D$, then radial SLE$_\kappa$ in $D$ connecting
$w$ and $z$ is obtained by conformal transformation using the unique
transformation $f$ of $\mathbb{D}$ onto $D$ with $f(0) = z$, $f(1) = w$.
Again, we think of this as being defined modulo time change. If
$a = 2/\kappa$ and $v_t = h_{t/\kappa}$, then

$$\dot v_t(z) = \frac{a}{2} \cot\Big(\frac{v_t(z) + W_t}{2}\Big) \qquad [4]$$

where $W_t := -\sqrt{\kappa}\, B_{t/\kappa}$ is a standard Brownian
motion. If $L_t = v_t(z) + W_t$, then we get

$$dL_t = \frac{a}{2} \cot\Big(\frac{L_t}{2}\Big)\, dt + dW_t$$


Radial and chordal SLE are closely related. In fact, if $\gamma$ is a
chordal SLE$_\kappa$ path in $\mathbb{H}$ from 0 to $\infty$,
$\tilde\gamma$ is a radial SLE$_\kappa$ path in $\mathbb{D}$ from 1 to 0,
and $\eta = -i \log \tilde\gamma$, then for small $t$ the distribution of
$\eta$ is absolutely continuous with respect to the distribution of a
(random time change of) $\gamma$. Showing this involves understanding the
behavior of the Loewner equation under conformal transformations. Suppose
$\gamma, \tilde\gamma$ have been parametrized as in [2] and [4] with
$a = 2/\kappa$. Let $g_t^*$ be the conformal transformation of
$\mathbb{H} \setminus \eta[0,t]$ onto $\mathbb{H}$ such that

$$g_t^*(z) = z + \frac{a^*(t)}{z} + \cdots, \qquad z \to \infty$$

and let $U_t^*$ be the Loewner driving function such that

$$\dot g_t^*(z) = \frac{\dot a^*(t)}{g_t^*(z) - U_t^*}$$

Here $a^*(t) = \operatorname{hcap}[\eta[0,t]]$. If we consider a time
change $\sigma$ such that $a^*(\sigma(t)) = at$ and let
$\hat U_t = U^*_{\sigma(t)}$ be the time-changed driving function, Itô's
formula can be used to show that $\hat U_t$ is a Brownian motion with
drift: its martingale part is a standard Brownian motion $W$, and the
$F_t$ in the drift term depends on $\gamma[0,t]$ and is independent of
$a$. Girsanov's theorem implies that Brownian motions with the same
variance but different drifts have absolutely continuous distributions.
In particular, qualitative properties such as existence of double points
or Hausdorff dimension of paths are the same for radial and chordal SLE.
$\hat U_t$ is a driftless Brownian motion if $a = 1/3$, $\kappa = 6$.

Whole-plane SLE$_\kappa$ from 0 to $\infty$ is a path
$\gamma : (-\infty,\infty) \to \mathbb{C}$ with $\gamma(-\infty) = 0$,
$\gamma(\infty) = \infty$, such that given $\gamma(-\infty,t]$, the
distribution of $\gamma[t,\infty)$ is that of radial SLE$_\kappa$ from
boundary point $\gamma(t)$ to interior point $\infty$ in the domain
$\mathbb{C} \setminus \gamma[-\infty,t]$. One can define whole-plane
SLE$_\kappa$ connecting two distinct points in $\mathbb{C}$ by conformal
transformation.


Locality and Restriction 


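There are two special values of $\kappa$: $\kappa = 6$, $a = 1/3$, which
satisfies the “locality” property, and $\kappa = 8/3$, $a = 3/4$, which
satisfies the “restriction” property. Suppose $\gamma$ is a chordal
SLE$_\kappa$ curve from 0 to $\infty$ in $\mathbb{H}$ parametrized as in
[2]. Suppose $\Phi : \mathcal{N} \to \mathbb{H}$ is a conformal map taking
a neighborhood $\mathcal{N}$ of 0 in $\mathbb{H}$ to $\Phi(\mathcal{N})$
and that locally maps $\mathbb{R}$ into $\mathbb{R}$. Let
$\gamma^*(t) = \Phi \circ \gamma(t)$, which is defined for sufficiently
small $t$. Let $g_t^*$ be the conformal transformation of
$\mathbb{H} \setminus \gamma^*[0,t]$ onto $\mathbb{H}$ with

$$g_t^*(z) = z + \frac{a^*(t)}{z} + \cdots, \qquad z \to \infty$$

and let $U_t^*$ be the driving function such that

$$\dot g_t^*(z) = \frac{\dot a^*(t)}{g_t^*(z) - U_t^*}$$

Here $a^*(t) = \operatorname{hcap}[\gamma^*[0,t]]$. If we change time,
$\hat\gamma_t = \gamma^*_{\sigma(t)}$, so that $a^*(\sigma(t)) = at$, then
an application of Itô's formula shows that $\hat U_t := U^*_{\sigma(t)}$
satisfies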

$$d\hat U_t = \frac{1}{2}\,(3a - 1)\, \frac{\Phi''_{\sigma(t)}}{(\Phi'_{\sigma(t)})^{2}}\, dt + dW_t \qquad [5]$$
Here $W_t$ is a standard Brownian motion,
$\Phi_t = g_t^* \circ \Phi \circ g_t^{-1}$, and $g_t$ is the conformal map
associated to $\gamma$. In particular, if $a = 1/3$, $\kappa = 6$,
$\hat U_t$ is a standard Brownian motion; hence, $\gamma^*$ has, up to
this time change, the distribution of SLE$_6$. The locality property for
SLE$_6$ can be stated as “the conformal image of SLE$_6$ is (a time change
of) SLE$_6$.” Intuitively, the SLE$_6$ path in a restricted domain does
not feel the boundary of the domain until it reaches it. Radial SLE$_6$
satisfies a similar locality property. Moreover, [5] can be used to show
that the image of chordal SLE$_6$ under the exponential map is the same
(for small time $t$) as radial SLE$_6$. The locality property explains
why SLE$_6$ is a natural candidate for the boundary of percolation
clusters.

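If $\kappa \le 4$, SLE$_\kappa$ paths are simple, that is, with no
self-intersections. Suppose $A \subset \overline{\mathbb{H}} \setminus \{0\}$
is a compact set such that $\mathbb{H} \setminus A$ is simply connected.
Let $\gamma$ denote a chordal SLE$_\kappa$ in $\mathbb{H}$ connecting 0
and $\infty$ and let $E_A$ be the event
$E_A = \{\gamma(0,\infty) \cap A = \emptyset\}$. Let
$\Phi_A : \mathbb{H} \setminus A \to \mathbb{H}$ be the unique conformal
transformation with $\Phi_A(0) = 0$, $\Phi_A(\infty) = \infty$,
$\Phi_A'(\infty) = 1$. On the event $E_A$, we can define
$\tilde\gamma(t) = \Phi_A \circ \gamma(t)$. Chordal SLE$_\kappa$ is said
to satisfy the restriction property if the conditional distribution of
$\tilde\gamma$ given $E_A$ is the same as (a time change of) $\gamma$. The
only $\kappa \le 4$ that satisfies this property is $\kappa = 8/3$. The
proof of this fact also establishes the formula: if $\gamma$ is a chordal
SLE$_{8/3}$ curve in $\mathbb{H}$ from 0 to $\infty$, then

$$P\{\gamma(0,\infty) \cap A = \emptyset\} = \Phi_A'(0)^{5/8} \qquad [6]$$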


There is a similar formula for radial SLE$_{8/3}$, which establishes a
radial restriction property. Suppose
$A \subset \overline{\mathbb{D}} \setminus \{0,1\}$ is a compact set such
that $\mathbb{D} \setminus A$ is simply connected. Let $\Psi_A$ be the
unique conformal transformation of $\mathbb{D} \setminus A$ onto
$\mathbb{D}$ with $\Psi_A(0) = 0$, $\Psi_A'(0) > 0$. Then, if $\gamma$ is a
radial SLE$_{8/3}$ curve from 1 to 0 in $\mathbb{D}$,

$$P\{\gamma(0,\infty) \cap A = \emptyset\} = \Psi_A'(0)^{5/48}\, |\Psi_A'(1)|^{5/8} \qquad [7]$$

The restriction property makes SLE$_{8/3}$ the natural candidate for the
scaling limit of self-avoiding walks.


Relation to Conformal Field Theory 


The Schramm-Loewner evolution is one of the tools 
used to rigorously prove predictions made using 
powerful, yet nonrigorous, arguments of conformal 
field theory. In conformal field theory, there is a 
parameter c, called the central charge, which 
classifies theories. To each $c \le 1$, there corresponds a
$\kappa \le 4$ and a “dual” $\kappa' = 16/\kappa \ge 4$:

$$c = c_\kappa = \frac{(8 - 3\kappa)(\kappa - 6)}{2\kappa}$$

In particular, $\kappa = 8/3$, $\kappa' = 6$ corresponds to central charge
zero. It is expected, and has been proved in a number of cases, that
SLE$_\kappa$ or SLE$_{\kappa'}$ curves will appear in scaling limits of
systems with central charge $c_\kappa$. These systems can also be
parametrized by the boundary scaling exponent or conformal weight

$$\alpha = \alpha_\kappa = \frac{6 - \kappa}{2\kappa}$$

For $\kappa = 8/3$, $\alpha = 5/8$, which is the exponent in [6].
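
The relations between $\kappa$, the dual value $16/\kappa$, the central charge, the boundary exponent, and the Hausdorff dimension of the path are easy to tabulate exactly. The helper names below are hypothetical, and the dimension function assumes the formula $\min\{1 + \kappa/8,\ 2\}$ for the path dimension; rational arithmetic is used so that values such as $5/8$ come out exactly.

```python
from fractions import Fraction


def central_charge(kappa):
    """c_kappa = (8 - 3 kappa)(kappa - 6) / (2 kappa)."""
    kappa = Fraction(kappa)
    return (8 - 3 * kappa) * (kappa - 6) / (2 * kappa)


def boundary_exponent(kappa):
    """alpha_kappa = (6 - kappa) / (2 kappa)."""
    kappa = Fraction(kappa)
    return (6 - kappa) / (2 * kappa)


def dual(kappa):
    """Dual parameter kappa' = 16 / kappa."""
    return Fraction(16) / Fraction(kappa)


def hausdorff_dimension(kappa):
    """Path dimension, assuming the formula min(1 + kappa / 8, 2)."""
    return min(1 + Fraction(kappa) / 8, Fraction(2))


if __name__ == "__main__":
    for kappa in (Fraction(2), Fraction(8, 3), Fraction(4), Fraction(6), Fraction(8)):
        print(kappa, dual(kappa), central_charge(kappa),
              boundary_exponent(kappa), hausdorff_dimension(kappa))
```

A value $\kappa$ and its dual $16/\kappa$ always give the same central charge, which can be checked directly with `central_charge(kappa) == central_charge(dual(kappa))`.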




Stochastic Loewner Evolutions 83 


In studying the relationship between SLE and conformal field theories,
two other probabilistic objects, restriction measures and the (Brownian)
loop soup, arise. An $\mathbb{H}$-hull (connecting 0 and $\infty$) is an
unbounded, connected, closed set $K \subset \overline{\mathbb{H}}$ with
$K \cap \mathbb{R} = \{0\}$ and such that $\mathbb{H} \setminus K$
consists of two connected components, one whose boundary includes the
positive reals and the other whose boundary includes the negative reals.
A (chordal) restriction measure on hulls $K$ is a probability measure
with the property that for any $A$ as in [6], the distribution of
$\Phi_A \circ K$ given $\{K \cap A = \emptyset\}$ is the same as the
original measure. The (Brownian) loop measure is a measure on unrooted
loops derived from Brownian bridges. It is the scaling limit of the
measure on random walk loops that gives each unrooted simple random walk
loop of length $2n$ measure $4^{-2n}$. The loop measure in a bounded
domain is obtained by restricting to loops that stay in that domain. We
can consider this as a measure on “hulls” by filling in the bounded holes
(so that the complement of the hull is connected). By doing this we get a
family of infinite measures on hulls, indexed by domains $D$, and this
family satisfies conformal invariance and the restriction property. The
loop soup with parameter $\lambda$ is a Poissonian realization from this
measure with parameter $\lambda$.

The set of all restriction measures is parametrized by
$\alpha \ge 5/8$; the $\alpha$-restriction measure has the property that

$$P\{K \cap A = \emptyset\} = \Phi_A'(0)^{\alpha}$$

For $\alpha = 5/8$, $K$ is given by the path of SLE$_{8/3}$. For integer
$\alpha$, the hull $K$ can be constructed by taking $\alpha$ independent
Brownian excursions in $\mathbb{H}$ (Brownian motions starting at 0
conditioned to stay in $\mathbb{H}$ for all times), and letting $K$ be
the hull obtained by taking the union of the paths and filling in the
bounded holes. If $\kappa \le 8/3$, $c_\kappa \le 0$, then the restriction
measure with exponent $\alpha_\kappa \ge 5/8$ can also be constructed as
follows: take a chordal SLE$_\kappa$ path and an independent realization
of the loop soup with intensity $\lambda_\kappa = -c_\kappa$; add to the
SLE$_\kappa$ path all the loops in the soup that intersect the
SLE$_\kappa$ curve; and then fill in all the bounded holes. The limiting
case $\alpha = 5/8$, $\lambda = 0$ gives the only measure supported on
simple curves that is also a restriction measure, SLE$_{8/3}$.

For $8/3 < \kappa \le 4$, $0 < c_\kappa \le 1$, it is conjectured, and
proved for small $c_\kappa$, that SLE$_\kappa$ curves can be found by
taking a loop soup with parameter $\lambda = c_\kappa$ and looking at
connected curves in the fractal set given by the complement of the union
of all the hulls generated by the loops.
Examples


The scaling limit of simple random walk, Brownian motion, is known to be
conformally invariant. A two-dimensional Brownian bridge or loop is a
Brownian motion $B_t$, $0 \le t \le 1$, conditioned so that $B_0 = B_1$.
The frontier or outer boundary of the Brownian motion is the boundary of
the unbounded component of the complement. Benoit Mandelbrot first
observed numerically that the outer boundary of Brownian motion had
fractal dimension $\approx 4/3$. Gregory Lawler, Oded Schramm, and
Wendelin Werner used SLE to prove that the boundary has Hausdorff
dimension 4/3. In fact, the outer boundary can be considered as an
SLE$_{8/3}$ loop.

SLE$_6$ and SLE$_{8/3}$ arise in the scaling limit of critical
percolation on the triangular lattice. Suppose that each vertex in the
upper half-plane triangular lattice is colored black or white, each with
probability 1/2. Suppose the real line gives a boundary condition of
black on the negative real line and white on the positive real line. Then
if we represent the vertices in the lattice as hexagons as in the figure,
a curve is formed which is the boundary between the black and white
clusters. This curve is called the “percolation exploration process.”
Stanislav Smirnov proved that the scaling limit of this curve is
conformally invariant, and from this it can be concluded that the curve
is chordal SLE$_6$. In particular, the Hausdorff dimension is 7/4 and the
scaling limit has double points. In the scaling limit, the “outer
boundary” of this curve has Hausdorff dimension 4/3 and its distribution
is absolutely continuous with respect to that of SLE$_{8/3}$. While this
result is expected for other critical percolation models, such as bond
percolation in $\mathbb{Z}^2$ with critical probability 1/2, it has only
been proved for the triangular lattice. Percolation has central charge 0
and the “locality” property can be seen in the lattice model. The outer
boundary of the curve has the same distribution as the outer boundary of
a Brownian motion that is reflected at angle $\pi/3$ off the real line.
Locally, the outer boundary of percolation, the outer boundary of complex
Brownian motion, and SLE$_{8/3}$ all look the same, and it is expected
that this will also be true for the scaling limit of self-avoiding walks.

There are three models derived in some way from simple random walk that
have been proved to have scaling limits of SLE$_\kappa$. The loop-erased
random walk (LERW) in a finite subset $V$ of $\mathbb{Z}^2$ connecting two
distinct points is obtained by taking a simple random walk from one point
to the other and erasing loops chronologically. The LERW is closely
related to uniform spanning trees; in fact, if one chooses a spanning
tree of $V$ from the uniform distribution on all spanning trees, then the
distribution of the unique path connecting the two points is exactly that
of the LERW (see Figure 3). Another description of the LERW is as the
Laplacian random walk: the LERW from $z$ to $w$ in $V$ chooses a new step
weighted by the value of the function that is harmonic on the complement
of $w$ and the path up to that point with boundary values 0 on the path
and 1 on $w$. The LERW in the discrete upper half-plane can be obtained
by erasing loops from a simple random walk excursion. The LERW and the
uniform spanning tree are systems with central charge $c = -2$. It has
been proved that the scaling limit of the LERW is SLE$_2$; hence, the
paths have Hausdorff dimension $1 + 2/8 = 5/4$.
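
Chronological loop erasure, as described above, can be sketched in a few lines. This is an illustrative implementation rather than code from the article: the function names are hypothetical, and the finite set $V$ is assumed here to be a `size` by `size` square of $\mathbb{Z}^2$ on which the walk moves to a uniformly chosen neighbor inside the square.

```python
import random


def loop_erase(path):
    """Chronological loop erasure of a finite path of hashable sites."""
    erased = []
    position = {}  # site -> index of that site in `erased`
    for site in path:
        if site in position:
            # The walk has closed a loop: cut back to the first visit of `site`.
            cut = position[site] + 1
            for removed in erased[cut:]:
                del position[removed]
            del erased[cut:]
        else:
            position[site] = len(erased)
            erased.append(site)
    return erased


def walk_until_hit(start, target, size, rng):
    """Simple random walk on a size x size grid graph, stopped on reaching target."""
    x, y = start
    path = [(x, y)]
    while (x, y) != target:
        neighbors = [
            (x + dx, y + dy)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < size and 0 <= y + dy < size
        ]
        x, y = rng.choice(neighbors)
        path.append((x, y))
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    walk = walk_until_hit((0, 0), (7, 7), 8, rng)
    lerw = loop_erase(walk)
    print(len(walk), len(lerw))
```

The result is always a self-avoiding nearest-neighbor path from the start to the target, since each retained site is followed by the next site the walk visited after its last return to it.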

There is another path associated to spanning trees given by the
one-to-one correspondence between spanning trees and Hamiltonian walks on
a corresponding directed (Manhattan) lattice on the dual graph (see
Figure 4). If the spanning trees, or equivalently the Hamiltonian walks,
are chosen using the uniform distribution, then the scaling limit of this
walk is the space-filling curve SLE$_8$. Note that 2 and 8 are the dual
values of $\kappa$ associated to $c = -2$.


Figure 3 A spanning tree and the path between two vertices. 
If the tree has the uniform distribution, the path has the 
distribution of the LERW. 


Figure 4 A spanning tree and the corresponding Hamiltonian 
walk. 
Another discrete process derived from simple random walk, the harmonic
explorer, has a scaling limit of SLE$_4$. There is a particular property
of SLE$_4$ that leads to the definition of this discrete process.
Consider a chordal SLE$_\kappa$ curve, let $z \in \mathbb{H}$, and let
$Z_t$ be as in [3] with $a = 2/\kappa$. Itô's formula shows that
$\Theta_t := \arg(Z_t)$ satisfies

$$d\Theta_t = \Big(\frac{1}{2} - a\Big) \frac{\sin(2\Theta_t)}{|Z_t|^2}\, dt - \frac{\sin \Theta_t}{|Z_t|}\, dW_t$$

In particular, $\Theta_t$ is a martingale if and only if $a = 1/2$,
$\kappa = 4$. The probability that a complex Brownian motion starting at
$z \in \mathbb{H}$ first hits $\mathbb{R}$ on the negative half-line can
be shown to be $\arg(z)/\pi$. If $\kappa \le 4$, then we can see that
$\Theta_\infty$ equals 0 or $\pi$, depending on whether $z$ is on the
right or left side of the path $\gamma(0,\infty)$. For the martingale case
$\kappa = 4$, $\Theta_t/\pi$ represents the probability that $z$ is on the
left side of $\gamma(0,\infty)$, given $\gamma(0,t]$. The harmonic
explorer is a process on the hexagonal lattice defined to have this
property. In a way similar to the percolation process, the walk is
defined as the boundary between black and white hexagons on the
triangular lattice. However, when an unexplored hexagon is reached in the
harmonic explorer, it is colored black with probability $q$, where $q$ is
the probability that a simple random walk on the triangular lattice
starting at that hexagon (considered as a vertex in the triangular
lattice) hits a black hexagon before hitting a white hexagon. It is not
difficult to show that this process has the property that for $z$ away
from the curve, the “probability of $z$ ending on the left given the
curve of steps” is a martingale.

There are many other models for which SLE$_\kappa$ curves are expected in
the limit, but this has not been established. The most difficult part is
to show the existence of a limit that is conformally invariant. One
example is the self-avoiding walk (SAW). It is an open problem to
establish that there exists a scaling limit of the uniform measure on
SAWs and to establish conformal invariance of the limit. However, the
nature of the discrete model is such that if the limit exists, it must
satisfy the restriction property. Hence, under the assumption of
conformal invariance, the only possible limit is SLE$_{8/3}$. Numerical
simulations strongly support the conjecture that SLE$_{8/3}$ is the limit
of SAWs, and this gives strong evidence for the conformal invariance
conjecture for SAWs. Critical exponents for SAWs (as well as critical
exponents for many other models) can be predicted nonrigorously from
rigorous scaling exponents for the corresponding SLE paths.


Generalizations 


One of the reasons that the theory of SLE is nice for simply connected
domains is that a simply connected domain with an arc connected to the
boundary of the domain removed is again simply connected. For nonsimply
connected domains, it is more difficult to describe because the conformal
type of the slit domain changes as time evolves. In the case of a curve
crossing an annulus, this can be done with an added parameter referring
to the conformal type of the annulus (two annuli of the form
$\{z : r_j < |z| < s_j\}$ are conformally equivalent if and only if
$r_1/s_1 = r_2/s_2$). It is not immediately obvious what the correct
definition of SLE should be in general domains and, more generally, on
Riemann surfaces. One possibility for $\kappa \le 4$ is to consider a
configurational (equilibrium statistical mechanics) view of SLE. Consider
a family of measures $\{\mu_D(z,w)\}$, where $D$ ranges over domains and
$z, w$ are distinct boundary points at which $\partial D$ is locally
analytic, supported on simple curves from $z$ to $w$ (modulo time
change). Let $\mu_D^{\#}(z,w) = \mu_D(z,w)/|\mu_D(z,w)|$ be the
corresponding probability measures, which may be defined even if
$\partial D$ is not smooth at $z, w$. Then the following axioms should
hold:


• Conformal invariance. If $f : D \to D'$ is a conformal transformation,
$f \circ \mu_D^{\#}(z,w) = \mu_{D'}^{\#}(f(z), f(w))$.

• Conformal Markov property.

• Perturbation of domains. Suppose $D_1 \subset D$ and
$\partial D_1, \partial D$ agree near $z, w$. Then $\mu_{D_1}(z,w)$ should
be absolutely continuous with respect to $\mu_D(z,w)$. Let $Y$ denote the
Radon-Nikodym derivative of $\mu_{D_1}(z,w)$ with respect to
$\mu_D(z,w)$. Then

$$Y(\gamma) = 1\{\gamma(0, t_\gamma) \subset D_1\}\, F_c(D; \gamma, D \setminus D_1)$$

where $F_c$ is to be determined. In the case where $D, D_1$ are simply
connected, $F_c(D; \gamma, D \setminus D_1)$ is a $c$-dependent power of
$J(\gamma, D, D_1)$, where $J(\gamma, D, D_1)$ denotes the probability
that there is a loop in the Brownian loop soup in $D$ that intersects
both $\gamma$ and $D \setminus D_1$. (There is no problem defining this
quantity in nonsimply connected domains, but it is not clear that it is
the right quantity.) Here $c = c_\kappa$. The restriction property tells
us that $F_0 = 1$.

• Conformal covariance. If $f$ is as above, and $\partial D, \partial D'$
are smooth near $z, w$ and $f(z), f(w)$, respectively, then

$$f \circ \mu_D(z,w) = |f'(z)|^{\alpha}\, |f'(w)|^{\alpha}\, \mu_{D'}(f(z), f(w))$$

Here $\alpha = \alpha_\kappa$ is the boundary scaling exponent.


See also: Boundary Conformal Field Theory; Percolation 
Theory; Random Walks in Random Environments. 
Introduction


The concept of stochastic resonance was introduced by physicists. It
originated in a toy model designed for a qualitative description of
periodicity phenomena in the recurrences of glacial eras in Earth’s
history. It has since become popular in numerous areas of the natural
sciences: neuronal response to periodic stimuli, variations of
magnetization in a ferromagnetic system, voltage variations in the simple
Schmitt trigger electronic circuit or in more complicated devices,
behavior of lasers in optical bi-stability, etc. The interest in this
ubiquitous phenomenon is enhanced by signal analysis: an optimal dose of
noise in some system can substantially boost signal transduction. Noise
in this context does not enter the system as an impurity perturbing its
performance, but on the contrary as a catalyst triggering amplified
stochastic response to weak periodic signals.


The Climate Paradigm 


The phenomenon of stochastic resonance was first discovered in an
elementary climate model used to explain major transitions in
paleoclimatic time series covering glacial cycles. Data collected, for
instance, from ice or deep sea cores allow one to deduce estimates of the
average temperature on Earth over the last 700 000 years. They exhibit
periodic switching between ice ages and warm ages with fast spontaneous
transitions. The average periodicity of the glaciation time series
obtained is $\sim 10^5$ years. In order to explain temperature
variations, Benzi et al. (1981) introduced random perturbations into an
energy balance model of the Budyko-Sellers type. This model describes the
evolution of the seasonal and global average temperature $X$ caused by
defects in the balance between incoming and outgoing radiation

$$c\, \frac{dX}{dt} = E_{\mathrm{in}} - E_{\mathrm{out}}$$

where $c$ is the active thermal inertia of the system. The incoming
energy is modeled as proportional to the “solar constant” $Q$:

$$E_{\mathrm{in}} = Q\, \Big(1 + A \cos\frac{2\pi t}{T}\Big)$$

with $T \sim 92\,000$ years and a relative amplitude $A$ of 0.1% of $Q$.
This exceedingly small variation of the solar constant is caused by a
modulation of the orbital eccentricity of the Earth's trajectory
(Figure 1).

Figure 1 Modulation of the orbital eccentricity.

The outgoing radiation $E_{\mathrm{out}}$ is composed of two essential
parts. The first part $a(X) E_{\mathrm{in}}$ is dominated by the albedo
$a(X)$, representing the proportion of energy reflected back to space. It
is a decreasing function of temperature, due to the higher rate of
reflection from a brighter Earth at low temperatures, which imply a
larger volume of ice. The second part of the outgoing radiation comes
from the fact that the Earth radiates energy like a black body, and is
given by the Stefan-Boltzmann law $\gamma X^4$, where $\gamma$ is the
Stefan constant. Describing the balance of energy terms as a slowly and
weakly time-varying gradient of a potential $U$, the balance model can be
expressed by

$$\frac{dX(t)}{dt} = -\frac{\partial U}{\partial x}\Big(\frac{t}{T}, X(t)\Big)$$


where the time period 1 is blown up to (large) $T$ by time scaling. The
roles of deep and shallow wells switch periodically (Figure 2). Since the
variation of the solar constant is extremely small, we can assume that
the height of the barrier between the two wells is lower-bounded by a
positive constant. The system then admits three steady states, two of
which are stable and separated by roughly 10 K. Like the solar constant,
they fluctuate slowly and very weakly. Therefore, this deterministic
system cannot account for climate changes with temperature variations of
$\sim 10$ K. They can only be explained by allowing transitions between
the two stable steady states, which become possible by adding noise to
the system. In general, short timescale phenomena such as annual
fluctuations in solar radiation are modeled by Gaussian white noise of
intensity $\varepsilon$ and lead to equations of the type

$$dX_t = -\frac{\partial U}{\partial x}\Big(\frac{t}{T}, X_t\Big)\, dt + \sqrt{\varepsilon}\, dW_t \qquad [1]$$


which are generic for studying stochastic resonance in numerous physical
and biological models. Generally, the input of noise amplifies a weak
periodic signal by creating trajectories that switch randomly, yet nearly
periodically, between metastable states. An optimal tuning of noise
intensity to period length (“stochastic resonance”) significantly
enhances the response of the random system to weak perturbations with
long periods.

Figure 2 Deep and shallow wells switching periodically.


Strongly Damped Brownian Particle 


It is useful to roughly compare solutions of stochastic differential
equations and motions of Brownian particles in double-well landscapes
(Figure 3) in order to understand properties of their trajectories (see
Schweitzer 2003, Mazo 2002). As in the previous section, let us
concentrate on a one-dimensional setting; the treatment generalizes
easily to the finite-dimensional setting. By Newton's law, the motion of
a particle is governed by all forces acting on it. Let us denote by $F$
the sum of these forces, by $m$ the mass, by $x$ the space coordinate,
and by $v$ the velocity of the particle. Then

$$m \dot v = F$$


Let us first assume the potential to be switched off. In their pioneering
work at the turn of the twentieth century, Marian v. Smoluchowski and
Paul Langevin introduced stochastic concepts to describe the Brownian
particle motion by claiming that at time $t$

$$F(t) = -\gamma_0\, v(t) + \sqrt{2 k_B T \gamma_0}\, \dot W_t$$

The first term results from friction $\gamma_0$ and is velocity
dependent. An additional stochastic force represents random interactions
between Brownian particles and their simple molecular random environment.
The white noise $\dot W$ (formal derivative of the Wiener process) plays
the crucial role. The diffusion coefficient (standard deviation of the
random impact) is composed of Boltzmann's constant $k_B$, friction, and
environmental temperature $T$. It satisfies the condition of the
fluctuation-dissipation theorem expressing the balance of energy loss due
to friction and energy gain resulting from noise. The equation of motion
becomes

$$dv(t) = -\frac{\gamma_0}{m}\, v(t)\, dt + \frac{\sqrt{2 k_B T \gamma_0}}{m}\, dW_t$$




Figure 3 Brownian particle in a double-well landscape. 


88 Stochastic Resonance 




Figure 4 Resonance pictures for diffusions. 


In the stationary regime, the stationary Ornstein-Uhlenbeck process
provides its solution

$$v(t) = v(0)\, e^{-(\gamma_0/m) t} + \frac{\sqrt{2 k_B T \gamma_0}}{m} \int_0^t e^{-(\gamma_0/m)(t-s)}\, dW_s$$

The ratio $\beta := \gamma_0/m$ determines the dynamic behavior. Let us
focus on the over-damped situation with large friction and very small
mass. Then for $t \gg 1/\beta = \tau$ (relaxation time), the first term in
the expression for velocity can be neglected, while the stochastic
integral represents a Gaussian process. By integrating, we obtain in the
over-damped limit ($\beta \to \infty$) that $v$ and thus $x$ is Gaussian
with almost constant mean

$$m(t) = x(0) + \frac{1 - e^{-\beta t}}{\beta}\, v(0) \approx x(0)$$

and covariance close to that of a scaled Brownian motion, see Nelson
(1967):

$$K(s,t) = \frac{2 k_B T}{\gamma_0} \min(s,t) + \frac{k_B T}{\gamma_0 \beta} \Big(-2 + 2 e^{-\beta t} + 2 e^{-\beta s} - e^{-\beta |t-s|} - e^{-\beta (t+s)}\Big) \approx \frac{2 k_B T}{\gamma_0} \min(s,t)$$


Hence, the time-dependent change of the velocity of the Brownian particle
can be neglected, the velocity rapidly thermalizes ($\dot v \approx 0$),
while the spatial coordinate remains far from equilibrium. In the
so-called adiabatic transformation, the evolution of the particle’s
position is thus given by the transformed Langevin equation

$$dx(t) = \frac{\sqrt{2 k_B T \gamma_0}}{\gamma_0}\, dW_t$$


Let us next suppose that we have a Brownian
particle in an external field of force (see Figure 3)
associated with a potential $U(t,x)$. This leads to the
Langevin equation

\[ m\, dv(t) = -\gamma_0 v(t)\, dt - \frac{\partial U}{\partial x}(t, x(t))\, dt + \sqrt{2 k_B T \gamma_0}\, dW_t . \]


In the over-damped limit, after relaxation time, the 
adiabatic elimination of the fast variables (Gardiner 
2004) leads to an equation similar to the one 
encountered in the previous section: 


\[ dx(t) = -\frac{1}{\gamma_0} \frac{\partial U}{\partial x}(t, x(t))\, dt + \frac{\sqrt{2 k_B T \gamma_0}}{\gamma_0}\, dW_t . \]


In the particular case of some double-well potential
$x \mapsto U(t,x)$ with slow periodic variation, the following
patterns of behavior of the solution trajectories
will be experienced. If the temperature is high, noise has
a predominant influence on the motion, and the
particle often crosses the barrier separating the two
wells during one period. The behavior of the particle
does not seem to be periodic but rather chaotic. If
the temperature is low, the particle stays for a very
long time in the starting well, fluctuating weakly
around the equilibrium position. Its energy is too
low to follow the periodic variation of the
potential. So in this case too, the trajectories do
not look periodic. Between these two extreme
situations, there exists a regime of noise intensities 
for which the energy transmitted by the noise is 
sufficient to cross the barrier almost twice per 
period. The parameters are then near to the 
resonance point and the motion exhibits periodic 
switching (Figure 4). 
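The three regimes described above can be explored numerically. The following is a minimal sketch, not the authors' method: it integrates the over-damped Langevin equation with the Euler-Maruyama scheme, assuming the hypothetical potential $U(t,x) = x^4/4 - x^2/2 + A\,x\cos(2\pi t/\text{period})$, units with $\gamma_0 = 1$, and a hysteresis threshold of $\pm 0.5$ for counting well switches. All function names and parameter values are illustrative assumptions.

```python
import numpy as np


def simulate_double_well(kT, period, amplitude=0.1, gamma0=1.0,
                         n_periods=4, dt=0.01, x0=-1.0, seed=0):
    """Euler-Maruyama path of dx = -(1/gamma0) dU/dx dt + sqrt(2 kT/gamma0) dW.

    Assumed potential (an illustration, not taken from the text):
    U(t, x) = x**4/4 - x**2/2 + amplitude * x * cos(2 pi t / period).
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(n_periods * period / dt))
    # Standard deviation of one Brownian increment scaled by the noise level.
    noise_scale = np.sqrt(2.0 * kT / gamma0 * dt)
    kicks = noise_scale * rng.standard_normal(n_steps)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        t = k * dt
        dU_dx = x[k] ** 3 - x[k] + amplitude * np.cos(2.0 * np.pi * t / period)
        x[k + 1] = x[k] - dU_dx / gamma0 * dt + kicks[k]
    return x


def count_switches(x, threshold=0.5):
    """Count well-to-well transitions, using hysteresis to ignore
    small fluctuations around the barrier at x = 0."""
    state = -1 if x[0] < 0 else 1
    switches = 0
    for value in x:
        if state == -1 and value > threshold:
            state, switches = 1, switches + 1
        elif state == 1 and value < -threshold:
            state, switches = -1, switches + 1
    return switches


if __name__ == "__main__":
    for kT in (0.001, 0.05, 1.0):
        path = simulate_double_well(kT=kT, period=50.0)
        print(f"kT={kT}: {count_switches(path)} well switches in 4 periods")
```

Scanning `kT` at fixed `period` and looking for roughly two switches per period gives a crude, simulation-based picture of the resonance regime.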


Transition Criteria 
and Quasideterministic Motion 


Studying stochastic resonance accordingly means
looking for the range of regimes for which periodic
behavior is enhanced and eventually optimal. The
optimal relation between period $T$ and noise
intensity $\varepsilon$ emerges in the small noise limit. To
explain this, let us focus on the basic indicator for
periodic transitions: the time the Brownian particle
needs to exit from the starting well, say the left one.
In the “frozen” case, that is, if the time variation of
the potential term is eliminated just by freezing it at
some time $s$, the asymptotics of the exit time is
derived from the classical large deviation theory of
randomly perturbed dynamical systems (see Freidlin
and Wentzell 1998). Let us assume that $U$ is locally
Lipschitz. We denote by $D_l$ (resp. $D_r$) the domain
corresponding to the left (resp. right) well and by $\chi$
their common boundary. The law of the first exit
time $\tau^\varepsilon_{D_l} = \inf\{t > 0 : X^\varepsilon_t \notin D_l\}$ is described by some
particular functional related to large deviations. For
$t > 0$, we introduce the “action functional” on the
space of continuous functions $C([0,t])$ on $[0,t]$ by


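\[ S^s_t(\varphi) = \begin{cases} \dfrac{1}{2} \displaystyle\int_0^t \left( \dot\varphi_u + \frac{\partial U}{\partial x}(s, \varphi_u) \right)^2 du & \text{if } \varphi \text{ is absolutely continuous,} \\ +\infty & \text{otherwise,} \end{cases} \]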


which is non-negative and vanishes on the set
of solutions of the ordinary differential equation
$\dot x = -(\partial U/\partial x)(s, x)$. Let $x, y \in \mathbb{R}$. In relation with
the action functional, we define the quasipotential

\[ V_s(x, y) = \inf\{ S^s_t(\varphi) : \varphi \in C([0,t]),\ \varphi_0 = x,\ \varphi_t = y,\ t \ge 0 \} . \]


It represents the minimal work the diffusion starting
in $x$ has to do in order to reach $y$. To switch wells,
the Brownian particle starting in the left well’s
bottom $x_l$ has to overcome the barrier. So we let

\[ V_s = \inf_{y \in \chi} V_s(x_l, y) . \]


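This minimal work needed to exit from the left well
can be computed explicitly, and is seen to be equal to
twice its depth. The asymptotic behavior of the exit
time is expressed by

\[ \lim_{\varepsilon \to 0} \varepsilon \ln E[\tau^\varepsilon_{D_l}] = V_s \]

and

\[ \lim_{\varepsilon \to 0} P_x\left( e^{(V_s - \delta)/\varepsilon} < \tau^\varepsilon_{D_l} < e^{(V_s + \delta)/\varepsilon} \right) = 1 \quad \text{for any } \delta > 0 . \]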
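The statement that the minimal work equals twice the depth can be checked directly in the one-dimensional frozen case. The following is a sketch under the assumption that $\chi$ reduces to the saddle point between the wells, written $x_s$ here (a symbol introduced only for this illustration). For any absolutely continuous path $\varphi$ with $\varphi_0 = x_l$ and $\varphi_t = x_s$,

\[ \frac12 \int_0^t \Big( \dot\varphi_u + \frac{\partial U}{\partial x}(s,\varphi_u) \Big)^2 du = \frac12 \int_0^t \Big( \dot\varphi_u - \frac{\partial U}{\partial x}(s,\varphi_u) \Big)^2 du + 2 \int_0^t \frac{\partial U}{\partial x}(s,\varphi_u)\, \dot\varphi_u\, du \ge 2 \big( U(s, x_s) - U(s, x_l) \big) . \]

The bound is approached along the time-reversed relaxation path $\dot\varphi = +(\partial U/\partial x)(s,\varphi)$ as $t \to \infty$, so that $V_s = 2\,(U(s,x_s) - U(s,x_l))$, that is, twice the depth of the left well.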


The prefactor for the exponential rate derived by
Freidlin and Wentzell (1998) was first given by
Eyring and Kramers and then by Bovier et al. (2004).
Let us now assume that the left well is the deeper
one at time $s$. If the Brownian particle has enough
time to cross the barrier, that is, if $T > e^{V_s/\varepsilon}$, then
whatever the starting point is, Freidlin (2000) proved
that it should stay near $x_l$ in the following sense:

\[ \Lambda\left( t \in [0,1] : |X^\varepsilon_{tT} - x_l| > \delta \right) \to 0 \]

in probability as $\varepsilon \to 0$. Here $\Lambda$ denotes Lebesgue
measure on $\mathbb{R}$. If $T < e^{V_s/\varepsilon}$, the time left is not long
enough for crossings: the particle stays in the
starting well, near the stable equilibrium point:

\[ \Lambda\left( t \in [0,1] : |X^\varepsilon_{tT} - (x_l \mathbf{1}_{\{x \in D_l\}} + x_r \mathbf{1}_{\{x \in D_r\}})| > \delta \right) \to 0 , \]

where $x$ denotes the starting point.


This observation is at the basis of Freidlin’s law of
quasideterministic periodic motion discussed in the
subsequent section. The lesson it teaches is this: to
observe switching of the position to the energetically
most favorable well, $T$ should be larger than some
critical level $e^{\lambda/\varepsilon}$. Measuring time in exponential
scales by $\mu$ through the equation $T^\varepsilon = e^{\mu/\varepsilon}$, the
condition becomes $\mu > \lambda$.


Stochastic Resonance for Landscapes, 
Frozen on Half-Periods 


This particular case has analytical advantages, since
it allows one to employ classical techniques of
semigroup and operator theory. The situation is the
following: let $U$ be a double-well potential with
minima $x_l = -1$ and $x_r = 1$ and a saddle point at
the origin. We assume that $U(x) \to \infty$ as $|x| \to \infty$
and $U(-1) = -V/2 = -V_l/2$, $U(1) = -v/2 = -V_r/2$,
$U(0) = 0$, and $0 < v < V$. We define the 1-periodic
potential by $U(t,x) = U(x)$ for $t \in [0, 1/2)$ and
$U(t,x) = U(t + 1/2, -x)$. Hence on each
half-period the corresponding diffusion is time homogeneous.
The critical level $\lambda$ is then easily defined by
$\lambda = v$, that is, twice the depth of the shallow well. By
letting

\[ \phi(t) = \begin{cases} -1 & \text{for } t \in [k, k + \tfrac12), \\ \phantom{-}1 & \text{for } t \in [k + \tfrac12, k + 1), \end{cases} \qquad k = 0, 1, 2, \ldots, \]

the periodic function which describes the location of
the global minimum of the potential, we get in the
small noise limit

\[ \Lambda\left( t \in [0,1] : |X^\varepsilon_{tT} - \phi(t)| > \delta \right) \to 0 \]


in probability as $\varepsilon \to 0$. This result expresses
Freidlin’s law of quasideterministic motion: for
large periods, the trajectories of the particle
approach a periodic deterministic function. But the
sense in which this notion measures periodicity does
not take into account that for large periods short
excursions to the wrong well may occur in an erratic
way without counting much for Lebesgue measure
of time. In fact, if the period is too large, that is,
$\mu > V$, the time available in one period permits the
exit of not only the shallow well but also that of the
deep well. So, whatever the starting position of
the particle is, the number of observed transitions in
one half period becomes very large. Indeed the first
time $\xi$ at which the particle starting in $x_l$ hits $x_l$ again after
visiting the position $x_r$ satisfies

\[ E(\xi) \approx e^{V/\varepsilon} + e^{v/\varepsilon} < T^\varepsilon = e^{\mu/\varepsilon} . \]


The motion of the particle appears more chaotic
than periodic: noise intensity is too large compared
to period length. We avoid this range of chaotic
spontaneous transitions by defining the resonance
interval $I_R = ]v, V[$ as the range of admissible energy
parameters $\mu$ for randomly periodic behavior. In this
regime, the trajectories possess periodicity properties.
In these terms the resonance point describes the
tuning rate $\mu_R \in I_R$ for which the stochastic response
to weak external periodic forcing is optimal. To
make sense, this point has to refer to some measure
of quality for periodicity of random trajectories. In
the huge physics literature concerning resonance,
two families of criteria can be distinguished. The
first one is based on invariant measures and spectral
properties of the infinitesimal generator associated
with the diffusion $X^\varepsilon$. Now, $X^\varepsilon$ is not a time-homogeneous
Markov process and consequently does not admit invariant
measures. But by taking into account deterministic
motion of time in the interval of periodicity and
considering the process $Z_t = (t \bmod T^\varepsilon, X^\varepsilon_t)$, we
obtain a Markov process with an invariant measure
$\nu_t(x)\,dx$. In other words, the law of $X^\varepsilon_t \sim \nu_t(x)\,dx$ and
the law of $X^\varepsilon_{t+T} \sim \nu_{t+T}(x)\,dx$, under this measure,
are the same for all $t \ge 0$. Let us present the most
important criteria of this first family:


• the spectral power amplification (SPA), which
plays an eminent role in the physics literature,
describes the energy carried by the spectral
component of the averaged trajectories of $X^\varepsilon$
corresponding to the period:

\[ M_{SPA}(\varepsilon, T) = \left| \int_0^1 E_\nu[X^\varepsilon_{sT}]\, e^{2\pi i s}\, ds \right|^2 \]

• the SPA-to-noise ratio, giving the ratio of the
amplitude of the response and the noise intensity,
which is also related to the signal-to-noise ratio:

\[ M_{SPN}(\varepsilon, T) = M_{SPA}(\varepsilon, T)/\varepsilon^2 \]

• the total energy of the averaged trajectories:

\[ \int_0^1 \left| E_\nu[X^\varepsilon_{sT}] \right|^2 ds \]

[Figure 5: Resonance pictures for Markov chain.]


The second family of criteria is more probabilistic. 
It refers to quality measures based on transition 
times between the domains of attraction of the local 
minima, residence times distributions measuring the 
time spent in one well between two transitions, or 
interspike times. This family is certainly less popular 
in the physics community. 

However, measures related to invariant measures
may suffer from robustness deficiency (Imkeller and
Pavlyukevich 2002). To explain what we mean by
robustness, let us introduce a model reduction first
discussed by McNamara and Wiesenfeld (1989).
Instead of studying the diffusion $X^\varepsilon$ in the double-well
landscape, they introduce a two-state Markov
chain $Y^\varepsilon$ (Figure 5), the dynamics of which just
takes account of the domain of attraction the diffusion
is in, and which therefore has state space $\{-1, 1\}$. A
reasonable choice of the infinitesimal generator should
retain the dynamics of the diffusion’s transitions
characterized by Kramers’ rate. We may take


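\[ Q(t) = \begin{pmatrix} -\varphi & \psi \\ \varphi & -\psi \end{pmatrix}, \qquad 0 \le t < \frac{T}{2}, \]

with the roles of $\varphi$ and $\psi$ exchanged for $T/2 \le t < T$, and
periodically continued on $\mathbb{R}_+$. Here, $\varphi = p\, e^{-V/\varepsilon}$ and
$\psi = q\, e^{-v/\varepsilon}$. The prefactors of subexponential order
are beyond the scope of large deviation theory. They
are related to the curvature of the potential in the
minima and the saddle point of the landscape and
given by

\[ p = \frac{1}{2\pi} \sqrt{U''(-1)\,|U''(0)|}, \qquad q = \frac{1}{2\pi} \sqrt{U''(1)\,|U''(0)|} . \]
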
On the intervals $[kT/2, (k+1)T/2[$, $k \ge 0$, the
Markov chain $Y^\varepsilon$ is time-homogeneous and its
transition probabilities can be expressed in terms
of $\varphi$ and $\psi$. For instance, the probability with which
the chain jumps from state $-1$ to state $+1$ in the time
window $[t, t+h]$ equals $\varphi h + o(h)$, if this time
interval is contained in $[kT/2, (k+1)T/2[$ for
some even $k$. The stationary measure of the Markov
chain, denoted by $\nu$, can be explicitly calculated, and
so can the classical quality measures based on the
spectral notions. For instance, the spectral power
amplification coefficient equals


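\[ M_{SPA}(\varepsilon, T) = \left| \int_0^1 E_\nu[Y^\varepsilon_{sT}]\, e^{2\pi i s}\, ds \right|^2 = \frac{4}{\pi^2}\, \frac{T^2 (\varphi - \psi)^2}{(\varphi + \psi)^2 T^2 + \pi^2} . \]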


This simple expression admits asymptotically a 
unique maximum which exhibits the resonance 
point: 


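\[ T(\varepsilon) \cong \frac{\pi}{\sqrt{2pq}} \sqrt{\frac{v}{V - v}}\; e^{(V+v)/2\varepsilon} \left\{ 1 + O\!\left( e^{-(V-v)/\varepsilon} \right) \right\} . \]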
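The relation between the closed-form SPA coefficient and the resonance point can be checked numerically. The sketch below takes the two formulas as quoted in this section (rates $\varphi = p\,e^{-V/\varepsilon}$, $\psi = q\,e^{-v/\varepsilon}$) and verifies that, for the period $T(\varepsilon_0)$ given by the leading-order resonance formula, the SPA coefficient is maximal in $\varepsilon$ near $\varepsilon_0$. The parameter values $V = 2$, $v = 1$, $p = q = 1$ and all function names are assumptions chosen for illustration.

```python
import numpy as np


def spa_markov(eps, T, V=2.0, v=1.0, p=1.0, q=1.0):
    """Closed-form SPA coefficient of the two-state chain, as quoted in the text."""
    phi = p * np.exp(-V / eps)   # rate of leaving the deep well
    psi = q * np.exp(-v / eps)   # rate of leaving the shallow well
    return (4.0 / np.pi**2) * T**2 * (phi - psi)**2 / ((phi + psi)**2 * T**2 + np.pi**2)


def resonance_period(eps, V=2.0, v=1.0, p=1.0, q=1.0):
    """Leading-order resonance relation T(eps), without the O(.) correction."""
    return np.pi / np.sqrt(2.0 * p * q) * np.sqrt(v / (V - v)) * np.exp((V + v) / (2.0 * eps))


def optimal_noise(T, eps_grid, **params):
    """Noise intensity on the grid that maximizes the SPA coefficient at fixed T."""
    return eps_grid[np.argmax(spa_markov(eps_grid, T, **params))]


if __name__ == "__main__":
    eps0 = 0.1
    T = resonance_period(eps0)
    eps_star = optimal_noise(T, np.linspace(0.05, 0.3, 25001))
    print(f"T(eps0) = {T:.3e}, maximizing noise intensity = {eps_star:.5f}")
```

A grid search is used instead of a root finder because the maximum is very flat; the grid bounds are arbitrary and should bracket the expected optimum.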


The optimal period is then exponentially large, as
was suggested by large deviation theory, and the
growth rate is the sum of the two wells' depths. The
simple Markov chain model is popular since the
usual physical quantities are easily computable and
since it is believed to mimic the dynamics of a
Brownian particle in the corresponding double-well
landscape. However, the models are not as similar
as expected (Freidlin 2003). Indeed, in a reasonably
large time window around the resonance point for
$Y^\varepsilon$, the tuning picture of the spectral power
amplification for the diffusion is different. Under
weak regularity conditions on the potential, it
exhibits strict monotonicity in the window. Hence,
optimal tuning points for diffusion and Markov
chain differ essentially. In other words, the SPA
tuning behavior of the diffusion is not robust for
passage to the reduced model. This strange deficiency
is difficult to explain. The main reason for this
subtle effect appears to be that the diffusive nature
of the Brownian particle is neglected in the reduced
model. In order to point out this feature, we may
compute the SPA coefficient of $g(X^\varepsilon)$, where $g$ is a
particular function designed to cut out the small
fluctuations of the diffusion in the neighborhood of
the bottoms of the wells, by identifying all states
there. So $g(x) = -1$ (resp. 1) in some neighborhood
of $-1$ (resp. 1) and otherwise $g$ is the identity. This
results in

\[ \tilde M_{SPA}(\varepsilon, T) = \left| \int_0^1 E_\nu[g(X^\varepsilon_{sT})]\, e^{2\pi i s}\, ds \right|^2 . \]


In the small noise limit this quality function admits a
local maximum close to the resonance point of the
reduced model: the growth rate of the optimal period $T(\varepsilon)$ is also given
by the sum of the wells’ depths. So the lack of
robustness seems to be due to the small fluctuations 
of the particle in the wells’ bottoms. In any case, this 
clearly calls for other quality measures to be used to 
transfer properties of the reduced model to the 
original one. Our discussion indicates that due to 
their emphasis on the pure transition dynamics, the 
second family of quality measures should be used. 
For these notions there is no need to restrict to 
landscapes frozen in time-independent potential 
states on half-period intervals. 


Stochastic Resonance for Continuously 
Varying Landscapes 


From now on the potential $U(t, x)$ is supposed to be
continuously varying in $(t,x)$. For simplicity, its
local minima are assumed to be located at $\pm 1$, and
its only saddle point at 0, independently of time. So
the only meta-stable states on the whole time axis
are $\pm 1$. Let us denote by $\Delta_-(t)$ (resp. $\Delta_+(t)$) the
depth of the left (resp. right) well at time $t$. Together
with $U$, these functions are continuous and
1-periodic. Assume that they are strictly monotone
between their global extrema. Let us now come
back to the motion of a Brownian particle in this
landscape. The exit time law by Eyring-Kramers-Freidlin
entails that trajectories get close to the
global minimum, if the period is large enough.
Stated as before in exponential rates $T^\varepsilon = e^{\mu/\varepsilon}$, with
$\mu > \max_{i=\pm} \sup_{t \ge 0} 2\Delta_i(t)$, that is, $\mu$ exceeds the
maximal work needed to cross the barrier, the
particle often switches between the two wells and
should stay close to the deepest position in the
landscape. This position being described by the
function $\phi(t) = 2 \cdot \mathbf{1}_{\{\Delta_+(t) > \Delta_-(t)\}} - 1$, we get in the small
noise limit

\[ \Lambda\left( t \in [0,1] : |X^\varepsilon_{tT} - \phi(t)| > \delta \right) \to 0 \]


in probability. But on these long timescales, many
short excursions to the wrong well are observed, and
trajectories look chaotic instead of periodic. So we
have to look at smaller periods even at the cost that
the particle may not stay close to the global
minimum. Let us study the transition dynamics.
Assume that the starting point is $-1$, corresponding
to the bottom of the deep well. If the depth of the well
is always larger than $\mu = \varepsilon \log T^\varepsilon$, the particle has too
little time during one period to climb the barrier, and
should stay in the starting well. If, on the contrary,
the minimal work to leave the starting well, given by
$2\Delta_-(s)$, becomes smaller than $\mu$ at some time $s$, then
the transition can and will happen. More formally,
for $\mu \in [\inf_{t \ge 0} 2\Delta_-(t), \sup_{t \ge 0} 2\Delta_-(t)]$, we define
(Figure 6)

\[ a^-_\mu(s) = \inf\{ t \ge s : 2\Delta_-(t) \le \mu \} . \]


The first transition time from $-1$ to 1, denoted $\tau_+$,
has the following asymptotic behavior as
$\varepsilon \to 0$: $\tau_+/T^\varepsilon \to a^-_\mu(0)$. At the second transition the
particle returns to the starting well. If $a^+_\mu$ is defined
analogously with respect to the depth function $\Delta_+$,
this transition will occur near the deterministic time
$a^+_\mu(a^-_\mu(0))\, T^\varepsilon$. In order to observe periodicity, and to
exclude chaoticity from all parts of its trajectories,
the particle has to stay for some time in the other
well before returning. This will happen under the
assumption $2\Delta_+(a^-_\mu(0)) > \mu$, that is, the right well is
the deep one at transition time. In fact, we can
define the resonance interval $I_R$ (Figure 7), as the set
of all scales $\mu$ for which trajectories exhibit
periodicity in the small noise limit, by

\[ I_R = \left] \max_{i=\pm} \inf_{t \ge 0} 2\Delta_i(t),\ \inf_{t \ge 0} \max_{i=\pm} 2\Delta_i(t) \right[ . \]

[Figure 6: Definition of $a^-_\mu$.]

[Figure 7: Resonance interval.]


On this interval the trajectories get close to deterministic
periodic ones. Again, periodicity is quantified by a
quality measure, to be maximized in order to obtain
resonance as the best possible response to periodic
forcing. One interesting measure is based on the
probability that random transitions happen in some
small time window around a deterministic time, in
the small noise limit (Herrmann and Imkeller 2005).
Formally, for $h > 0$, the measure gives

\[ M_h(\varepsilon, T) = \min_{i=\pm} P_i\left( \tau_{-i}/T^\varepsilon \in [a^i_\mu - h, a^i_\mu + h] \right) \]

where $P_i$ is the law of the diffusion starting in $i$, $\tau_{-i}$ is the
first transition time to the opposite well, and $a^i_\mu = a^i_\mu(0)$. In
the small noise limit, this quality measure tends to 1,
and optimal tuning can be related to the exponential
rate at which this happens. This is due to the
following large deviations principle:

\[ \lim_{\varepsilon \to 0} \varepsilon \log\left( 1 - M_h(\varepsilon, T) \right) = \max_{i=\pm} \{ \mu - 2\Delta_i(a^i_\mu - h) \} \]


for $\mu \in I_R$, with uniform convergence on each
compact subset of $I_R$. The result is established
using classical large-deviation techniques applied to
locally time homogeneous approximations of the
diffusion. Maximizing the transition probability in
the time window means minimizing the
default rate obtained by the large deviations
principle. This can be easily achieved. In fact, if the
window length $2h$ is small, then
$\mu - 2\Delta_i(a^i_\mu - h) \approx 2h\Delta_i'(a^i_\mu)$, since $2\Delta_i(a^i_\mu) = \mu$ by definition. The value
$\Delta_i'(a^i_\mu)$ is negative, so we have to find the position
where its absolute value is maximal. In this position
the depth of the starting well has the most rapid
drop under the level $\mu$, characterizing the link
between the noise intensity and the period. So the
transition time is best concentrated around it.

It is clear that a good candidate for the resonance
point is given by the limit, if it exists, of the
global minimizer $\mu_R(h)$ as the window length $h$ tends
to 0. This limit is therefore called the resonance
point of the diffusion with time-periodic landscape
$U$. Let us note that for sinusoidal depth functions

\[ \Delta_-(t) = \frac{V + v}{4} + \frac{V - v}{4} \cos(2\pi t) \]

and

\[ \Delta_+(t) = \Delta_-(t + \tfrac12) \]

the optimal tuning is given by $T^\varepsilon = \exp(\mu_R/\varepsilon)$ with
$\mu_R = (v + V)/2$. This optimal rate is equivalent to
the optimal rate given by the SPA coefficient of the
reduced dynamics’ Markov chain in the preceding
section.
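The resonance point for the sinusoidal depth functions can be located numerically by following the recipe above: for each scale $\mu$, compute the transition time $a^-_\mu(0)$ and pick the $\mu$ at which the depth of the starting well drops fastest. The sketch below does this for the left well only (the right well gives the same answer by the half-period shift); the values $V = 2$, $v = 1$ and all function names are assumptions made for illustration.

```python
import numpy as np


def depth_minus_slope(t, V=2.0, v=1.0):
    """Time derivative of Delta_-(t) = (V+v)/4 + (V-v)/4 * cos(2 pi t)."""
    return -(V - v) / 4.0 * 2.0 * np.pi * np.sin(2.0 * np.pi * t)


def transition_time(mu, V=2.0, v=1.0):
    """a_mu(0) = inf{t >= 0 : 2*Delta_-(t) <= mu}, in closed form for the cosine
    profile (valid for v < mu < V, where the first crossing lies in (0, 1/2))."""
    return np.arccos((2.0 * mu - (V + v)) / (V - v)) / (2.0 * np.pi)


def resonance_point(V=2.0, v=1.0, n_grid=2001):
    """Scale mu in the open interval (v, V) at which the depth drops fastest."""
    mu_grid = np.linspace(v, V, n_grid)[1:-1]   # interior points only
    slopes = depth_minus_slope(transition_time(mu_grid, V, v), V, v)
    return mu_grid[np.argmin(slopes)]           # most negative slope


if __name__ == "__main__":
    print("numerical resonance point:", resonance_point())
    print("value (v + V)/2 from the text:", (1.0 + 2.0) / 2.0)
```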

The big advantage of the quality measure $M_h$ is its
robustness. Indeed, consider the reduced model
consisting of a two-state Markov chain with
infinitesimal generator

\[ Q(t) = \begin{pmatrix} -\varphi(t) & \psi(t) \\ \varphi(t) & -\psi(t) \end{pmatrix} \]

where $\varphi(t) = \exp\{-2\Delta_-(t/T)/\varepsilon\}$ and $\psi(t) = \exp\{-2\Delta_+(t/T)/\varepsilon\}$.
The law of transition times of
this Markov chain is readily computed from Laplace
transforms. Normalized by $T^\varepsilon$ it converges to $a^i_\mu$.
This calculation even reveals a rigorous underlying
pattern for the second- and higher-order transition
times interpreting the interspike distributions of
the physics literature. The dynamics of diffusion
and Markov chain are similar. Resonance points
provided by $M_h$ for the diffusion and its analog for
the Markov chain agree.


Related Notions: Synchronization 


In the preceding sections, we interpreted stochastic 
resonance as optimal response of a randomly 
perturbed dynamical system to weak periodic forcing, 
in the spirit of the physics literature (see Gammaitoni 
et al. (1998)). Our crucial assumption concerned the
barrier heights a Brownian particle has to overcome
in the potential landscape of the dynamical system: they
are uniformly bounded below in time. Measures for the
quality of tuning were based on essentially two 
concepts: one concerning spectral criteria, with the 
spectral power amplification as most prominent 
member, the other one concerning the pure transi- 
tions dynamics between the domains of attraction of 
the local minima. A number of different criteria can 
be used to create an optimal tuning between the 
intensity of the noise perturbation and the large 
period of the dynamical system. The relations have to
be of an exponential type $T^\varepsilon = \exp(\mu/\varepsilon)$, since the
Brownian particle needs exponentially long times to
cross the barrier separating the wells according to the
Eyring-Kramers-Freidlin transition law. Our barrier
height assumption seems natural in many situations, 
but can fail in others. If it becomes small periodically, 
and eventually scales with the noise-intensity para- 
meter, the Brownian particle does not need to wait an 
exponentially long time to climb it. So periodicity 
obtains for essentially smaller timescales. In this 
setting, the slowness of periodic forcing may also be 
assumed to be essentially subexponential in the noise 
intensity. 

If it is fast enough to allow for substantial changes
before large deviation effects can take over, we are
in the situation of Berglund and Gentz (2002). They
in fact consider the case in which the barrier
between the wells becomes low twice per period,
to the effect of modulating periodically a bifurcation
parameter: at time zero the right-hand well becomes 
almost flat, and at the same time the bottom of the 
well and the saddle approach each other; half a 
period later, a spatially symmetric scenario is 
encountered. In this situation, there is a threshold 
value for the noise intensity under which transitions 
become unlikely. Above this threshold, the trajec- 
tories typically contain two transitions per period. 
Results are formulated in terms of concentration 
properties for random trajectories. The intuitive 
picture is this: with overwhelming probability, 
sample paths will be concentrated in spacetime sets 
scaling with the small parameters of the problem. In 
higher dimensions, these sets may be given by 
adiabatic or center manifolds of the deterministic 
system, which allow model reduction of higher- 
dimensional systems to lower-dimensional ones. 
Asymptotic results hold for any choice of the small 
parameters in a whole parameter region. A passage 
to the small noise limit as for optimal tuning in the 
preceding sections is not needed. 

Related problems studied by Berglund and Gentz 
in the multidimensional case concern the noise- 
induced passage through periodic orbits, where 
unexpected phenomena arise. Here, as opposed to 
the classical Freidlin-Wentzell theory, the distribu- 
tion of first-exit points depends nontrivially on the 
noise intensity. Again aiming at results valid for 
small but nonvanishing parameters in subexponen- 
tial scale ranges, they investigate the density of first- 
passage times in a large regime of parameter values, 
and obtain insight into the transition from the 
stochastic resonance regime into the synchronization 
regime. 


See also: Dynamical Systems in Mathematical Physics:
An Illustration from Water Waves; Magnetic Resonance
Imaging; Spectral Theory for Linear Operators;
Stochastic Differential Equations.

Introduction


String field theory (SFT) is the second-quantized 
approach to string theory. In the usual, first- 
quantized, formulation of string perturbation the- 
ory, one postulates a recipe for the string S-matrix in 
terms of a sum over two-dimensional (2D) world 
sheets embedded in spacetime. Very schematically, 


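\[ \langle\langle V_1(k_1) \ldots V_n(k_n) \rangle\rangle = \sum_{\text{topologies}} g_s^{-\chi} \int [d\mu_\alpha]\, \langle V_1(k_1) \ldots V_n(k_n) \rangle_{\{\mu_\alpha\}} \] [1]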


Here the left-hand side stands for the S-matrix of the
physical string states $\{V_a(k_a)\}$. The symbol $\langle \ldots \rangle_{\{\mu_\alpha\}}$
denotes a correlation function on the 2D world sheet,
which is a punctured Riemann surface of Euler number
$\chi$ and given moduli $\{\mu_\alpha\}$. In SFT, one aims to recover
this standard prescription from the Feynman rules of a
second-quantized spacetime action $S[\Phi]$. The string
field $\Phi$, the fundamental dynamical variable, can be
thought of as an infinite-dimensional array of spacetime
fields $\{\phi^i(x^\mu)\}$, one field for each basis state in the
Fock space of the first-quantized string.

The most straightforward way to construct $S[\Phi]$
uses the unitary light-cone gauge. Light-cone SFT is 
an almost immediate transcription of Mandelstam's 


light-cone diagrams in a second-quantized language. 
While often useful as a bookkeeping device, light- 
cone SFT seems unlikely to represent a real 
improvement over the first-quantized approach. By 
contrast, from our experience in ordinary quantum 
field theory, we should expect Poincaré-covariant 
SFTs to give important insights into the issues of 
vacuum selection, background independence and the 
nonperturbative definition of string theory. 

Covariant SFT actions are well established for the 
open (Witten 1986), closed (Zwiebach 1993) and 
open/closed (Zwiebach 1998) bosonic string. These 
theories are based on the BRST formalism, where the
world sheet variables include the $bc$ ghosts introduced
in gauge-fixing the world sheet metric to the
conformal gauge $g_{ab} \sim \delta_{ab}$. (An alternative approach
(Hata et al.), based on covariantizing light-cone SFT, 
will not be described in this article.) Much less is 
presently known for the superstring: classical actions 
have been established for the Neveu-Schwarz sector 
of the open superstring (Berkovits 2001) and for the 
heterotic string (Berkovits et al. 2004). 

During the first period of intense activity in SFT 
(1985-1992), the covariant bosonic actions were 
constructed and shown to pass the basic test of 
reproducing the S-matrix [1] to each order in the 
perturbative expansion. The more recent revival of 
the subject (since 1999) was triggered by the 
realization that SFT contains nonperturbative infor- 
mation as well: D-branes emerge as solitonic 
solutions of the classical equations of motion in 


open SFT (OSFT). We can hope that the nonpertur- 
bative string dualities will also be understood in the 
framework of SFT, once covariant SFTs for the 
superstring are better developed. 

In this article, we review the basic formalism of 
covariant SFT, using for illustration purposes the 
simplest model — cubic bosonic OSFT. We then 
briefly sketch the generalization to bosonic SFTs 
that include closed strings. Finally, we turn to the 
subjects of classical solutions in OSFT and the 
physics of the open-string tachyon. 


Open Bosonic SFT 


The standard formulation of string theory starts with 
the choice of an on-shell spacetime background where 
strings propagate. In the bosonic string, the closed 
string background is described by a conformal field 
theory of central charge 26 (the “matter” CFT). The
total world sheet CFT is the direct sum of this matter
CFT and of the universal ghost CFT, of central charge
$-26$. To describe open strings, we must further specify
boundary conditions for the string endpoints. The 
open-string background is encoded in a boundary CFT 
(BCFT), a CFT defined in the upper-half plane, with 
conformal boundary conditions on the real axis 
(see Boundary Conformal Field Theory in this encyclo- 
pedia). In modern language, the choice of BCFT 
corresponds to specifying a D-brane state. 

In classical OSFT, we fix the closed-string back- 
ground (the bulk CFT) and consider varying the 
D-brane configuration (the boundary conditions). 
To lowest order in $g_s$, we can neglect the backreaction
of the D-brane on the closed-string fields,
since this is a quantum effect from the open-string
viewpoint. Let us prepare the ground by recalling
the standard $\sigma$-model philosophy. To describe off-shell
open-string configurations, we should allow for
general (not necessarily conformal) boundary conditions.
We can imagine proceeding as follows:


1. We choose an initial open-string background, a
reference BCFT that we shall call BCFT$_0$. For
example, a D$p$ brane in flat 26 dimensions
(Neumann boundary conditions on $p + 1$ coordinates,
Dirichlet on $25 - p$ coordinates).

2. We then write a basis of boundary perturbations
around this background. Taking, for example,
BCFT$_0$ to be a D25 brane in flat space, the world
sheet action $S_{ws}$ takes the schematic form

\[ S_{ws} = \frac{1}{2\pi\alpha'} \int_{UHP} \partial X_\mu \bar\partial X^\mu + \int_{\mathbb{R}} \left[ T(X) + A_\mu(X)\, \partial X^\mu + B_{\mu\nu}(X)\, \partial X^\mu \partial X^\nu + \cdots \right] \] [2]

Here to the standard free bulk action (integrated
over the upper-half complex plane UHP) we have
added perturbations localized on the real axis $\mathbb{R}$.
Notice that the basis of perturbations depends on
the chosen BCFT$_0$.

3. We interpret the coefficients $\{\tilde\phi^i(x^\mu)\}$ of the
perturbations as spacetime fields. (The tilde on
$\tilde\phi^i(x)$ serves as a reminder that these fields are not
quite the same as the fields $\phi^i(x)$ that will appear
in the OSFT action.) We are after a spacetime
action $S[\{\tilde\phi^i\}]$ such that solutions of its classical
equations of motion correspond to conformal
boundary conditions:

\[ \frac{\delta S}{\delta \tilde\phi^i} = 0 \ \text{(spacetime)} \quad \longleftrightarrow \quad \beta_i[\{\tilde\phi^j\}] = 0 \ \text{(world sheet)} \] [3]

We recognize in [2] the familiar open-string
tachyon $T(x)$ and gauge field $A_\mu(x)$, which are the
lowest modes in an infinite tower of fields. Relevant
perturbations on the world sheet (with conformal
dimension $h < 1$) correspond to tachyonic fields in
spacetime ($m^2 < 0$), whereas marginal world sheet
perturbations ($h = 1$) give massless spacetime fields.
To achieve a complete description, we must include
all the higher massive open-string modes as well,
which correspond to nonrenormalizable boundary
perturbations ($h > 1$). In the traditional $\sigma$-model
approach, this appears like a daunting task. The
formalism of OSFT will automatically circumvent
this difficulty.
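The correspondence between world sheet dimension and spacetime mass can be made explicit with a standard mass-shell argument. The sketch below assumes a boundary perturbation of the form $\mathcal{O}_h\, e^{ik\cdot X}$, where $\mathcal{O}_h$ has dimension $h$ at zero momentum, and a mostly-plus spacetime metric; both conventions are assumptions of this illustration. The plane-wave factor contributes $\alpha' k^2$ to the boundary dimension, and the perturbation preserves conformal invariance to first order when its total dimension is 1:

\[ \alpha' k^2 + h = 1 \quad \Longrightarrow \quad m^2 = -k^2 = \frac{h - 1}{\alpha'} . \]

For the tachyon ($h = 0$) this gives $m^2 = -1/\alpha'$, and for the gauge field ($h = 1$) it gives $m^2 = 0$.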


The Open-String Field 


In covariant SFT the reparametrization ghosts play
a crucial role. The ghost CFT consists of the
Grassmann odd fields $b(z)$, $c(z)$, $\bar b(\bar z)$, $\bar c(\bar z)$, of dimensions
$(2,0)$, $(-1,0)$, $(0,2)$, $(0,-1)$, respectively. The
boundary conditions on the real axis are
$b = \bar b$, $c = \bar c$. The state space $\mathcal{H}_{BCFT_0}$ of the full
matter + ghost BCFT can be broken up into
subspaces of definite ghost number,

\[ \mathcal{H}_{BCFT_0} = \bigoplus_{G=-\infty}^{\infty} \mathcal{H}^{(G)}_{BCFT_0} \] [4]


We use conventions where the SL(2, R) vacuum $|0\rangle$
carries zero ghost number, $G(|0\rangle) = 0$, while
$G(c) = +1$ and $G(b) = -1$. As is familiar from the
first-quantized treatment, physical open-string states
are identified with $G = +1$ cohomology classes of
the BRST operator,

\[ Q|V_{phys}\rangle = 0, \qquad |V_{phys}\rangle \sim |V_{phys}\rangle + Q|\Lambda\rangle, \qquad G(|V_{phys}\rangle) = +1 \] [5]

where the nilpotent BRST operator $Q$ has the
standard expression

\[ Q = \frac{1}{2\pi i} \oint \left( c\, T_{matter} + {:}\, b c \partial c\, {:} \right) \] [6]


Though not a priori obvious, it turns out that the
simplest form of the OSFT action is achieved by
taking as the fundamental off-shell variable an
arbitrary $G = +1$ element of the first-quantized
Fock space,

\[ |\Phi\rangle \in \mathcal{H}^{(1)}_{BCFT_0} \] [7]

By the usual state-operator correspondence of CFT,
we can also represent $|\Phi\rangle$ as a local (boundary)
vertex operator acting on the vacuum,

\[ |\Phi\rangle = \Phi(0)|0\rangle \] [8]

The open-string field $|\Phi\rangle$ is really an infinite-dimensional
array of spacetime fields. We can
make this transparent by expanding it as

\[ |\Phi\rangle = \sum_i \int d^{p+1}k\, |\Phi_i(k)\rangle\, \phi^i(k) \] [9]


where $\{|\Phi_i(k)\rangle\}$ is some convenient basis of $\mathcal{H}^{(1)}_{BCFT_0}$
that diagonalizes the momentum $k_\mu$. The fields
$\phi^i$ are a priori complex. This is remedied by
imposing a suitable reality condition on the string
field, which will be stated momentarily. Notice that
there are many more elements in $\{|\Phi_i(k)\rangle\}$ than in the
physical subspace (the cohomology classes of $Q$).
Some of the extra fields will turn out to be
nondynamical and could be integrated out, but at
the price of making the OSFT action look much
more complicated.

It is often useful to think of the string field in
terms of its Schrödinger representation, that is, as a
functional on the configuration space of open
strings. Consider the unit half-disk in the upper-half
plane, $D_H = \{|z| \le 1, \Im z \ge 0\}$, with the vertex
operator $\Phi(0)$ inserted at the origin. Impose BCFT$_0$
open string boundary conditions for the fields $X(z, \bar z)$
on the real axis (here $X(z, \bar z)$ is a short-hand notation
for all matter and ghost fields), and boundary
conditions $X(\sigma) = X_b(\sigma)$ on the curved boundary of
$D_H$, $z = \exp(i\sigma)$, $0 \le \sigma \le \pi$. The path integral over
$X(z, \bar z)$ in the interior of the half-disk assigns a
complex number to any given $X_b(\sigma)$, so we obtain a
functional $\Phi[X_b(\sigma)]$. This is the Schrödinger wave
function of the state $\Phi(0)|0\rangle$. Thus, we can think of
open-string functionals $\Phi[X_b(\sigma)]$ as the fundamental
variables of OSFT. This is as it should be: the
first-quantized wave functions are promoted to
dynamical fields in the second-quantized theory.
Finally, let us quote the reality condition for the
string field, which takes a compact form in the
Schrödinger representation:

\[ \Phi[X^\mu(\sigma), b(\sigma), c(\sigma)] = \Phi^*[X^\mu(\pi - \sigma), b(\pi - \sigma), c(\pi - \sigma)] \] [10]


where the superscript * denotes complex conjugation. 


The Classical Action 


With all the ingredients in place, it is immediate to 
write the quadratic part of the OSFT action. The 
linearized equations of motion must reproduce the 
physical-state condition [5]. This suggests 


$$ S \sim \langle\Phi|Q|\Phi\rangle $$    [11]


Here $\langle\,\cdot\,|\,\cdot\,\rangle$ is the usual BPZ inner product
of ${\rm BCFT}_0$, which is defined in terms of a two-point
correlator on the disk, as we review below. The ghost anomaly
implies that on the disk we must have $G_{\rm tot} = +3$, which
happily is the case in [11]. Moreover, since the inner product is
nondegenerate, variation of [11] gives

$$ Q|\Phi\rangle = 0 $$    [12]


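as desired. The equivalence relation
$|V_{\rm phys}\rangle \sim |V_{\rm phys}\rangle + Q|\Lambda\rangle$ is
interpreted in the second-quantized language as the spacetime
gauge invariance

$$ \delta_\Lambda|\Phi\rangle = Q|\Lambda\rangle, \qquad |\Lambda\rangle \in \mathcal{H}^{(0)}_{{\rm BCFT}_0} $$    [13]

valid for the general off-shell field. This equation is a very
compact generalization of the linearized gauge invariance for the
massless gauge field. Indeed, focusing on the level-zero
components, $|\Phi\rangle \sim A_\mu(x)\,(c\,\partial X^\mu)(0)|0\rangle$
and $|\Lambda\rangle \sim \Lambda(x)|0\rangle$, we find
$\delta A_\mu(x) = \partial_\mu \Lambda(x)$. It is then plausible to
guess that the nonlinear gauge invariance should take the form

$$ \delta_\Lambda|\Phi\rangle = Q|\Lambda\rangle + |\Phi\rangle * |\Lambda\rangle - |\Lambda\rangle * |\Phi\rangle $$    [14]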


where * is some suitable product operation that 
conserves ghost number 


$$ * \; : \; \mathcal{H}^{(n)}_{{\rm BCFT}_0} \otimes \mathcal{H}^{(m)}_{{\rm BCFT}_0} \to \mathcal{H}^{(n+m)}_{{\rm BCFT}_0} $$    [15]


Based on a formal analogy with 3D nonabelian 
Chern-Simons theory, Witten proposed the cubic 
action 


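$$ S = -\frac{1}{g_o^2} \left( \frac{1}{2}\langle\Phi|Q|\Phi\rangle + \frac{1}{3}\langle\Phi|\Phi * \Phi\rangle \right) $$    [16]

The string field $|\Phi\rangle$ is analogous to the Chern-Simons
gauge potential $A = A_i\,dx^i$, the *-product to the
$\wedge$-product of differential forms, $Q$ to the exterior
derivative $d$, and the ghost number $G$ to the degree of the form.
The analogy also suggests a number of algebraic identities:

$$ \begin{aligned}
Q^2 &= 0, \\
\langle QA|B\rangle &= -(-1)^{G(A)} \langle A|QB\rangle, \\
Q(A * B) &= (QA) * B + (-1)^{G(A)} A * (QB), \\
\langle A|B\rangle &= (-1)^{G(A)G(B)} \langle B|A\rangle, \\
\langle A|B * C\rangle &= \langle B|C * A\rangle, \\
A * (B * C) &= (A * B) * C
\end{aligned} $$    [17]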

Note in particular the associativity of the *-product. It is
straightforward to check that this algebraic structure implies the
gauge invariance of the cubic action under [14]. A *-product
satisfying all required formal properties can indeed be defined.
The most intuitive presentation is in the functional language.
Given an open-string curve $X(\sigma)$, $0 \le \sigma \le \pi$, we
single out the string midpoint $\sigma = \pi/2$ and define the left
and right “half-string” curves

$$ X_L(\sigma) \equiv X(\sigma), \qquad X_R(\sigma) \equiv X(\pi - \sigma), \qquad 0 \le \sigma \le \pi/2 $$    [18]

A functional $\Phi[X(\sigma)]$ can, of course, be regarded as a
functional of the two half-strings, $\Phi[X] \to \Phi[X_L, X_R]$.
We define

$$ (\Phi_1 * \Phi_2)[X_L, X_R] \equiv \int [dY] \; \Phi_1[X_L, Y] \, \Phi_2[Y, X_R] $$    [19]

where $\int [dY]$ is meant as the functional integral over the
space of half-strings $Y(\sigma)$, with
$Y(\pi/2) = X_L(\pi/2) = X_R(\pi/2)$. Figure 1a shows two open
strings interacting (to form a single open string) if and only if
the right half of the first string precisely overlaps with the
left half of the second string. Associativity is transparent
(Figure 1b).
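
The gluing rule [19] has the structure of a matrix product, with the half-string curves playing the role of row and column labels. The following is only a toy sketch under a strong assumption: the space of half-strings is replaced by N discrete labels, so that a string functional becomes an N x N table and the functional integral over Y becomes a finite sum. The names "star", "random_functional" and the random test data are illustrative and not part of the construction above; ghost signs are ignored.

```python
import random

def star(phi1, phi2):
    """Discretized version of [19]: sum over the shared half-string label y."""
    n = len(phi1)
    return [[sum(phi1[l][y] * phi2[y][r] for y in range(n)) for r in range(n)]
            for l in range(n)]

def random_functional(n, rng):
    """A toy 'string functional' Phi[X_L, X_R] stored as an n x n table."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]

def max_abs_diff(a, b):
    """Largest entrywise difference between two tables of equal shape."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

rng = random.Random(0)
N = 4  # number of discrete half-string labels (an arbitrary toy choice)
A, B, C = (random_functional(N, rng) for _ in range(3))

# Associativity: gluing (A*B) to C equals gluing A to (B*C).
assoc_defect = max_abs_diff(star(A, star(B, C)), star(star(A, B), C))
# The product is not commutative for generic functionals.
comm_defect = max_abs_diff(star(A, B), star(B, A))

print("associativity defect:", assoc_defect)
print("commutativity defect:", comm_defect)
```

In this toy picture the associativity shown in Figure 1b is simply the associativity of matrix multiplication, while the order of the two factors matters.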

Figure 1 Midpoint overlaps of open strings. Panels (a) and (b).

We can now translate this formal construction into the precise
CFT language. Very generally, an n-point vertex of open strings
can be defined by specifying an n-punctured disk, that is, a disk
with marked points on the boundary (punctures) and a choice of
local coordinates around each puncture. Local coordinates are
essential since we are dealing with off-shell open-string states.
The BPZ inner product (two-point vertex) is given by

$$ \langle\Phi_1|\Phi_2\rangle \equiv \langle I \circ \Phi_1(0) \; \Phi_2(0) \rangle_{\rm UHP}, \qquad I(z) = -\frac{1}{z} $$    [20]


The symbol $f \circ \Phi(0)$, where $f$ is a complex map, means the
conformal transform of $\Phi(0)$ by $f$. For example, if $\Phi$ is
a dimension-$d$ primary field, then
$f \circ \Phi(0) = f'(0)^d \, \Phi(f(0))$. If $\Phi$ is nonprimary,
the transformation rule will be more complicated and involve extra
terms with higher derivatives of $f$. By performing the SL(2, C)
transformation

$$ w = h(z) \equiv \frac{1 + iz}{1 - iz} $$    [21]
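we can represent the two-point vertex as a correlator on the unit
disk $D \equiv \{|w| \le 1\}$,

$$ \langle\Phi_1|\Phi_2\rangle = \langle f_1 \circ \Phi_1(0) \; f_2 \circ \Phi_2(0) \rangle_D, \qquad f_1(z_1) = -h(z_1), \quad f_2(z_2) = h(z_2) $$    [22]

The vertex operators are inserted at $w = -1$ and $w = +1$ on $D$
(see Figure 2a) and correspond to the two open strings at
(Euclidean) world-sheet time $\tau = -\infty$ (we take
$z = \exp(i\sigma + \tau)$). The left half of $D$ is the world sheet
of the first open string; the right half of $D$ is the world sheet
of the second string. The two strings meet at $\tau = 0$ on the
imaginary $w$ axis. The three-point Witten vertex is given by

$$ \langle\Phi_1, \Phi_2, \Phi_3\rangle \equiv \langle g_1 \circ \Phi_1(0) \; g_2 \circ \Phi_2(0) \; g_3 \circ \Phi_3(0) \rangle_D $$    [23]

where

$$ \begin{aligned}
g_1(z_1) &= e^{2\pi i/3} \left( \frac{1 + i z_1}{1 - i z_1} \right)^{2/3}, \\
g_2(z_2) &= \left( \frac{1 + i z_2}{1 - i z_2} \right)^{2/3}, \\
g_3(z_3) &= e^{-2\pi i/3} \left( \frac{1 + i z_3}{1 - i z_3} \right)^{2/3}
\end{aligned} $$    [24]

Figure 2 Representation of the quadratic and cubic vertices as
2- and 3-punctured unit disks: (a) quadratic vertex, (b) cubic vertex.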
The 3-punctured disk is depicted in Figure 2b, and describes the
symmetric midpoint overlap of the three strings at $\tau = 0$.
Finally, the relation between the three-point vertex and the
*-product is

$$ \langle\Phi_1|\Phi_2 * \Phi_3\rangle \equiv \langle\Phi_1, \Phi_2, \Phi_3\rangle $$    [25]

Knowledge of the right-hand side (RHS) in [25] for all $\Phi_1$
allows one to reconstruct the *-product. All formal properties
[17] are easily shown to hold in the CFT language. This completes
the definition of the OSFT action.
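
The conformal maps entering the two- and three-point vertices can be checked numerically. The sketch below assumes the maps as written in [20], [21], [22] and [24], with the 2/3 power taken on its principal branch (adequate on the upper half-disk, where h(z) has non-negative real part). The function names and the test point are illustrative choices. It checks that h composed with the BPZ inversion equals -h, that the three punctures sit on the unit circle 120 degrees apart, and that the string midpoint z = i is sent to the centre of the disk.

```python
import cmath

def h(z):
    """SL(2,C) map [21] from the upper half-plane to the unit disk."""
    return (1 + 1j * z) / (1 - 1j * z)

def inversion(z):
    """BPZ inversion I(z) = -1/z used in [20]."""
    return -1 / z

def g(k, z):
    """Witten-vertex maps [24] for k = 1, 2, 3 (principal branch of the 2/3 power)."""
    phases = {1: cmath.exp(2j * cmath.pi / 3), 2: 1.0, 3: cmath.exp(-2j * cmath.pi / 3)}
    return phases[k] * h(z) ** (2 / 3)

# [20] and [22] agree: h(I(z)) equals f_1(z) = -h(z).
z_test = 0.3 + 0.4j
bpz_defect = abs(h(inversion(z_test)) + h(z_test))

# The punctures (z = 0) land on the unit circle.
punctures = [g(k, 0) for k in (1, 2, 3)]

# The string midpoint z = i is mapped to w = 0 by all three maps.
midpoints = [g(k, 1j) for k in (1, 2, 3)]

# |g_2'(0)| from a forward finite difference; analytically (2/3)|h'(0)| = 4/3.
eps = 1e-7
scale = abs((g(2, eps) - g(2, 0)) / eps)

print("BPZ consistency defect:", bpz_defect)
print("punctures:", punctures)
print("midpoints:", midpoints)
print("|g_2'(0)|:", scale)
```

The derivative at the puncture is the quantity that enters the transformation rule for a primary field quoted above, so it sets the weight with which off-shell states enter the cubic vertex.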

Evaluation of the classical action is completely 
algorithmic and can be carried out for arbitrary 
massive states, with no fear of divergences, since in 
all required correlators the operators are inserted 
well apart from each other. 


Quantization 


Quantization is defined by the path integral over the
second-quantized string field. The first step is to deal with the
gauge invariance [14] of the classical action. The gauge symmetry
is reducible: not all gauge parameters $\Lambda^{(0)}$ (the
superscript labels ghost number) lead to a gauge transformation.
This is clear at the linearized level; indeed, if
$\Lambda^{(0)} = Q\Lambda^{(-1)}$, then
$\delta_{\Lambda^{(0)}} \Phi^{(1)} = Q^2 \Lambda^{(-1)} = 0$. Thus,
the set $\{\Lambda^{(0)}\}$ gives a redundant parametrization of the
gauge group. Characterizing this redundancy is somewhat subtle,
since fields of the form $\Lambda^{(-1)} = Q\Lambda^{(-2)}$ do not
really lead to a redundancy in $\Lambda^{(0)}$, and so on, ad
infinitum. It is clear that we need to introduce an infinite tower
of (second-quantized) ghosts for ghosts.

The Batalin-Vilkovisky formalism is a powerful way to handle the
problem. The basic object is the master action
$S(\phi^s, \phi^*_s)$, which is a function of the “fields” $\phi^s$
and of the “antifields” $\phi^*_s$. Each field is paired with a
corresponding antifield of opposite Grassmannality.
(“Grassmannality” is defined to be even or odd: a Grassmann even
(odd) field is a commuting (anticommuting) field.) The master
action must obey the boundary condition of reducing to the
classical action when the antifields are set to zero. (Note that
in general the set of fields $\phi^s$ will be larger than the set
of fields $\phi^i$ that appear in the classical action.)
Independence of the S-matrix on the gauge-fixing procedure is
equivalent to the BV master equation

$$ \frac{1}{2}\{S, S\} = -\hbar \Delta S $$    [26]

The antibracket $\{\,,\}$ and the $\Delta$ operator are defined as

$$ \{A, B\} \equiv \frac{\partial_r A}{\partial \phi^s} \frac{\partial_l B}{\partial \phi^*_s} - \frac{\partial_r A}{\partial \phi^*_s} \frac{\partial_l B}{\partial \phi^s}, \qquad \Delta \equiv \frac{\partial_r}{\partial \phi^s} \frac{\partial_l}{\partial \phi^*_s} $$    [27]

where $\partial_l$ and $\partial_r$ are derivatives from the left
and from the right. It is often convenient to expand $S$ in powers
of $\hbar$, $S = S_0 + \hbar S_1 + \hbar^2 S_2 + \cdots$, with


$$ \{S_0, S_0\} = 0, \qquad \{S_0, S_1\} + \{S_1, S_0\} = -2 \Delta S_0, \quad \ldots $$    [28]

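With these definitions in place, we shall simply describe the
answer, which is extremely elegant. In OSFT the full set of fields
and antifields is packaged in a single string field $|\Phi\rangle$
of unrestricted ghost number. If we write

$$ |\Phi\rangle = |\Phi_-\rangle + |\Phi_+\rangle, \qquad \text{with } G(\Phi_-) \le 1 \text{ and } G(\Phi_+) \ge 2 $$    [29]

all the fields are contained in $|\Phi_-\rangle$ and all the
antifields in $|\Phi_+\rangle$. To make the pairing explicit, we
pick a basis $\{|\Phi_s\rangle\}$ of $\mathcal{H}_{{\rm BCFT}_0}$,
and define a conjugate basis $\{|\Phi^c_s\rangle\}$ by

$$ \langle\Phi^c_s|\Phi_r\rangle = \delta_{sr} $$    [30]

Clearly, $G(\Phi^c_s) + G(\Phi_s) = 3$. Then

$$ |\Phi_-\rangle = \sum_{G(\Phi_s) \le 1} |\Phi_s\rangle \, \phi^s, \qquad |\Phi_+\rangle = \sum_{G(\Phi_s) \le 1} |\Phi^c_s\rangle \, \phi^*_s $$    [31]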


Basis states $|\Phi_s\rangle$ with even (odd) ghost number
$G(\Phi_s)$ are defined to be Grassmann even (odd). The full
string field $|\Phi\rangle$ is declared to be Grassmann odd. It
follows that $\phi^s$ is Grassmann even (odd) for $G(\Phi_s)$ odd
(even), and that the corresponding antifield $\phi^*_s$ has the
opposite Grassmannality of $\phi^s$, as it must be. With this
understanding of $|\Phi\rangle$, the classical master action $S_0$
is identical in form to the Witten action [16]! The boundary
condition is satisfied; indeed, setting $|\Phi_+\rangle = 0$, the
ghost number anomaly implies that only the terms with $G = +1$
survive. The equation $\{S_0, S_0\} = 0$ follows from
straightforward manipulations using the algebraic identities [17].
On the other hand, the issue of whether $\Delta S_0 = 0$, or
whether instead quantum corrections are needed to satisfy the full
BV master equation, is more subtle and has never been fully
resolved. The $\Delta$ operator receives singular contributions
from the same region of moduli space responsible for the
appearance of closed-string poles, which are discussed below. (See
Thorn (1989) for a classic statement of this issue.) It seems
possible to choose a basis in $\mathcal{H}_{{\rm BCFT}_0}$ such that
there are no quantum corrections to $S_0$ (Erler and Gross 2004).
In the following we shall derive the Feynman rules implied by
$S_0$ alone.
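
The parity assignments described above can be tabulated mechanically. The sketch below encodes only the rules stated in the text: basis states inherit their parity from the ghost number, the full string field is Grassmann odd, and conjugate states obey the pairing of ghost numbers adding up to 3. The function names and the range of ghost numbers shown are illustrative assumptions.

```python
def state_parity(ghost_number):
    """Basis states are Grassmann even (0) or odd (1) according to ghost number."""
    return ghost_number % 2

def coefficient_parity(ghost_number):
    """Parity of the coefficient multiplying a basis state of this ghost number,
    fixed by requiring the full string field to be Grassmann odd."""
    return (1 + state_parity(ghost_number)) % 2

def conjugate_ghost_number(ghost_number):
    """Ghost number of the conjugate state: G(Phi^c_s) + G(Phi_s) = 3."""
    return 3 - ghost_number

rows = []
for g in range(1, -4, -1):  # fields sit at G(Phi_s) <= 1
    g_conj = conjugate_ghost_number(g)
    field = coefficient_parity(g)           # parity of phi^s
    antifield = coefficient_parity(g_conj)  # parity of phi^*_s
    rows.append((g, g_conj, field, antifield))

for g, g_conj, field, antifield in rows:
    print(f"G={g:+d}  G^c={g_conj:+d}  field parity={field}  antifield parity={antifield}")
```

Every row pairs a field with an antifield of the opposite parity, and every conjugate state has ghost number at least 2, in line with the split [29].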


SFT Diagrams and Minimal Area Metrics 


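Imposing the Siegel gauge condition $b_0 \Phi = 0$, one finds the
gauge-fixed action

$$ S_{\rm gf} = -\frac{1}{g_o^2} \left( \frac{1}{2}\langle\Phi|c_0 L_0|\Phi\rangle + \frac{1}{3}\langle\Phi|\Phi * \Phi\rangle + \langle\beta|b_0|\Phi\rangle \right) $$    [32]

where $\beta$ is a Lagrangian multiplier. The propagator reads

$$ \frac{b_0}{L_0} = b_0 \int_0^\infty dT \, e^{-T L_0} $$    [33]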
Lo 0 


Since $L_0$ is the first-quantized open-string Hamiltonian,
$e^{-T L_0}$ is the operator that evolves the open-string wave
functions $\Psi[X(\sigma)]$ by Euclidean world-sheet time $T$. It
can be visualized as a flat rectangular strip of “horizontal”
width $\pi$ and “vertical” height $T$. Each propagator comes with
an antighost insertion

$$ b_0 = \int b(\sigma) $$    [34]

integrated on a horizontal trajectory.
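
On a state with a definite positive $L_0$ eigenvalue, the Schwinger representation [33] reduces to an elementary integral. The sketch below compares 1/level with the strip-length integral, both in closed form and by a midpoint rule. The eigenvalue 2.0, the cutoff and the step count are arbitrary illustrative values, and the integral converges only for a positive eigenvalue.

```python
import math

def schwinger_closed_form(level, t_max):
    """Closed form of the integral of exp(-T * level) over 0 <= T <= t_max."""
    return (1.0 - math.exp(-level * t_max)) / level

def schwinger_midpoint(level, t_max, steps=200000):
    """Midpoint-rule estimate of the same integral, as an independent check."""
    dt = t_max / steps
    return sum(math.exp(-level * (k + 0.5) * dt) for k in range(steps)) * dt

level = 2.0    # a positive L_0 eigenvalue (hypothetical value)
t_max = 40.0   # cutoff on the strip length T

exact = 1.0 / level
closed = schwinger_closed_form(level, t_max)
numeric = schwinger_midpoint(level, t_max)

print(exact, closed, numeric)
```

The part of the integral beyond the cutoff is exponentially small, which is the statement that very long strips are dominated by the lowest-lying states.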

The only elementary interaction vertex is the midpoint
three-string overlap, visualized in Figure 3. We are instructed to
draw all possible diagrams with given external legs (represented
as semi-infinite strips), and to integrate over all Schwinger
parameters $T_i \in [0, \infty)$ associated with the internal
propagators. The claim is that this prescription reproduces
precisely the first-quantized result [1]. This follows if we can
show that (1) the OSFT Feynman rules give a unique cover of the
moduli space of open Riemann surfaces; (2) the integration measure
agrees with the measure $[d\mu_\alpha]$ in [1]. The latter property
holds because the antighost insertion [34] is precisely the one
prescribed by the Polyakov formalism for integrating over the
moduli $T_i$. To show point (1), we introduce the concept of
minimal-area metrics, which has proved very fruitful. (Here and
below, our discussion of minimal-area metrics will summarize ideas
developed mainly by Zwiebach.) Quite generally, the Feynman rules
of an SFT provide us with a cell decomposition of the appropriate
moduli space of Riemann surfaces, a way to construct surfaces in
terms of vertices and propagators. Given a Riemann surface (for
fixed values of its complex moduli), the SFT must associate with
it one and only one string diagram. The diagram has more structure
than the Riemann surface: it defines a metric on it. In all known
covariant SFTs, this is the metric of minimal area obeying
suitable length conditions. Consider the following:

Figure 3 The cubic vertex represented as the midpoint gluing of
three strips.


Minimal-area problem for open SFT Let $R_o$ be a Riemann surface
with at least one boundary component and possibly punctures on the
boundary. Find the (conformal) metric of minimal area on $R_o$
such that all nontrivial Jordan open curves have length greater
than or equal to $\pi$. (A curve is said to be nontrivial if it
cannot be continuously shrunk to a point without crossing a
puncture.)


An OSFT diagram (for fixed values of its $T_i$) defines a Riemann
surface $R_o$ endowed with a metric solving this minimal-area
problem. This is the metric implicit in its picture: flat
everywhere except at the conical singularities of defect angle
$(n - 2)\pi$ when $n$ propagators meet symmetrically. (For
$n = 3$, these are the elementary cubic vertices; for $n > 3$,
they are effective vertices, obtained when propagators joining
cubic vertices collapse to zero length.) It is not difficult to
see both that the length conditions are obeyed, and that the
metric cannot be made smaller without violating a length
condition. Conversely, any surface $R_o$ endowed with a
minimal-area metric corresponds to an OSFT diagram. The idea is
that the minimal-area metric must have open geodesics
(“horizontal trajectories”) of length $\pi$ foliating the surface.
The geodesics intersect on a set of measure zero, the “critical
graph” where the propagators are glued. Bands of open geodesics of
infinite height are the external legs of the diagram, while bands
of finite height are the internal propagators.

The single cover of moduli space is then ensured by an existence
and uniqueness theorem for metrics solving the minimal-area
problem for OSFT. These metrics are seen to arise from
Jenkins-Strebel quadratic differentials. Existence shows that the
Feynman rules of OSFT generate each Riemann surface $R_o$ at least
once. Uniqueness shows that there is no overcounting: since
different diagrams correspond to different metrics (by inspection
of their picture), no Riemann surface can be generated twice.


Closed Strings in OSFT


As is familiar, the open-string S-matrix contains poles due to the
exchange of on-shell open and closed strings. The closed-string
poles are present in nonplanar loop amplitudes. We have seen that
OSFT reproduces the standard S-matrix. Factorization over the
open-string poles is manifest: it corresponds to propagator
lengths $T_i$ going to infinity. Surprisingly, the closed-string
poles are also correctly reproduced, despite the fact that OSFT
treats only the open strings as fundamental dynamical variables.
In some sense, closed strings must be considered as derived
objects in OSFT. Factorizing the amplitudes over the closed-string
poles, one finds that on-shell closed-string states can be
represented, at least formally, as certain singular open-string
fields with $G = +2$, closely related to the (formal) identity
string field. The picture is that of a folded open string, whose
left and right halves precisely overlap, with an extra
closed-string vertex operator inserted at the midpoint. The
corresponding open/closed vertex is given by

$$ \langle\Psi_{\rm phys}|\Phi\rangle_{OC} \equiv \langle \Psi_{\rm phys}(0) \; \mathcal{I} \circ \Phi(0) \rangle_D, \qquad \mathcal{I} \equiv \left( \frac{1 + iz}{1 - iz} \right)^2 $$    [35]


and describes the coupling to the open-string field of a
nondynamical, on-shell closed string $|\Psi_{\rm phys}\rangle$. It
is possible to add this open/closed vertex to the OSFT action.
Remarkably, the resulting Feynman rules give a single cover of the
moduli space of Riemann surfaces with at least one boundary, with
open and closed punctures. This is shown using the same
minimal-area problem as above, but now allowing for surfaces with
closed punctures as well.

We should finally mention that the structure of 
OSFT emerges frequently in topological string 
theory, in contexts where open/closed duality plays 
a central role. Two examples are the interpretation 
of Chern-Simons theory as the OSFT for the 
A-model on the conifold, and the interpretation of
the Kontsevich matrix integral for topological 
gravity as the OSFT on FZZT branes in (2,1) 
minimal string theory. 


Closed Bosonic SFT 


The generalization to covariant closed SFT is 
nontrivial, essentially because the requisite closed- 
string decomposition of moduli space is much more 
complicated. 

The free theory parallels the open case, with a minor complication
in the treatment of the CFT zero modes. The closed-string field is
taken to live in a subspace of the matter + ghost state space,
$|\Psi\rangle \in \tilde{\mathcal{H}}_{{\rm CFT}_0}$, where the tilde
means that we impose the subsidiary conditions

$$ b_0^- |\Psi\rangle = L_0^- |\Psi\rangle = 0, \qquad b_0^- \equiv b_0 - \bar{b}_0, \quad L_0^- \equiv L_0 - \bar{L}_0 $$    [36]


In the classical theory, the string field carries ghost number
$G = +2$, since it is the off-shell extension of the familiar
closed-string physical states, and the quadratic action reads

$$ S \sim \langle\Psi, Q_c \Psi\rangle $$    [37]

Here $Q_c$ is the usual closed BRST operator. The inner product
$\langle\,,\rangle$ is defined in terms of the BPZ inner product,
with an extra insertion of $c_0^- \equiv c_0 - \bar{c}_0$,

$$ \langle A, B\rangle \equiv \langle A|c_0^-|B\rangle $$    [38]

In [37] $G_{\rm tot} = +6$, as it should be. Without the extra ghost
insertion and the subsidiary conditions [36] it would not be
possible to write a quadratic action. The linearized equations of
motion and gauge invariance,

$$ Q_c|\Psi\rangle = 0, \qquad |\Psi\rangle \sim |\Psi\rangle + Q_c|\Lambda\rangle, \qquad |\Lambda\rangle \in \tilde{\mathcal{H}}_{{\rm CFT}_0} $$    [39]


give the expected cohomological problem. The fact that the
cohomology is computed in the semirelative complex,
$b_0^-|\Psi\rangle = b_0^-|\Lambda\rangle = 0$, well known from the
operator formalism of the first-quantized theory, is recovered
naturally in the second-quantized treatment.
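
The ghost-number bookkeeping behind the quadratic and cubic terms is simple arithmetic and can be spelled out explicitly. The sketch below uses the values quoted in the text (total ghost number 3 for the open-string inner product, 6 for the closed-string one, field ghost numbers 1 and 2). The assignment of ghost number +1 to the BRST operators and to the $c_0^-$ insertion is the standard convention and is an assumption here, as are the constant and function names.

```python
DISK_TOTAL = 3    # total ghost number required in the open-string action [11], [16]
CLOSED_TOTAL = 6  # total ghost number quoted for the closed-string action [37]

G_OPEN_FIELD = 1    # ghost number of the classical open-string field
G_CLOSED_FIELD = 2  # ghost number of the classical closed-string field
G_BRST = 1          # ghost number of Q and Q_c (standard convention, assumed)
G_C0_MINUS = 1      # ghost number of the c_0^- insertion (standard convention, assumed)

def total_ghost_number(*insertions):
    """Add up the ghost numbers of all insertions in a correlator."""
    return sum(insertions)

open_quadratic = total_ghost_number(G_OPEN_FIELD, G_BRST, G_OPEN_FIELD)
open_cubic = total_ghost_number(G_OPEN_FIELD, G_OPEN_FIELD, G_OPEN_FIELD)
closed_with_c0 = total_ghost_number(G_CLOSED_FIELD, G_C0_MINUS, G_BRST, G_CLOSED_FIELD)
closed_without_c0 = total_ghost_number(G_CLOSED_FIELD, G_BRST, G_CLOSED_FIELD)

print("open quadratic term allowed:", open_quadratic == DISK_TOTAL)
print("open cubic term allowed:", open_cubic == DISK_TOTAL)
print("closed quadratic term with c_0^- allowed:", closed_with_c0 == CLOSED_TOTAL)
print("closed quadratic term without c_0^- allowed:", closed_without_c0 == CLOSED_TOTAL)
```

The last line illustrates the remark above: without the extra ghost insertion the closed-string kinetic term would carry the wrong total ghost number.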

The interacting action is constructed iteratively, by demanding
that the resulting Feynman rules give a (unique) cover of moduli
space. This requires the introduction of infinitely many
elementary string vertices $\mathcal{V}_{g,n}$, where $n$ is the
number of closed-string punctures and $g$ the genus. This
decomposition of moduli space is more intricate than the
decomposition that arises in OSFT, but is in fact analogous to it,
when characterized in terms of the following.


Minimal-area problem for closed SFT Let $R_c$ be a closed Riemann
surface, possibly with punctures. Find the (conformal) metric of
minimal area on $R_c$ such that all nontrivial Jordan closed
curves have length greater than or equal to $2\pi$.


The minimal-area metric induces a foliation of $R_c$ by closed
geodesics of length $2\pi$. In the classical theory ($g = 0$), the
minimal-area metrics arise from Jenkins-Strebel quadratic
differentials (as in the open case), and geodesics intersect on a
measure-zero set. For $g > 0$, however, there can be foliation
bands of geodesics that cross. By inspecting the foliation, we can
break up the surface into vertices and propagators. In
correspondence with each puncture, there is a band of infinite
height, a flat semi-infinite cylinder of circumference $2\pi$,
which we identify as an external leg of the diagram. We mark a
closed geodesic on each semi-infinite cylinder, at a distance
$\pi$ from its boundary. Bands of finite height (internal bands
not associated to punctures) correspond to propagators if their
height is greater than $2\pi$; otherwise they are considered part
of an elementary vertex. Along any internal cylinder of height
greater than $2\pi$, we mark two closed geodesics, at a distance
$\pi$ from the boundary of the cylinder. If we now cut open all
the marked curves, the surface decomposes into a number of
semi-infinite cylinders (external legs), finite cylinders
(internal propagators) and surfaces with boundaries (elementary
interactions). Each elementary interaction of genus $g$ and with
$n$ boundaries is an element of $\mathcal{V}_{g,n}$. A crucial
point of this construction is that we took care of leaving a
“stub” of length $\pi$ attached to each boundary. Stubs ensure
that sewing of surfaces preserves the length condition on the
metric (no closed curve shorter than $2\pi$).
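
The cutting rule for internal bands can be phrased as a small decision procedure. The sketch below is only an illustration of the rule as stated: a band taller than $2\pi$ is cut at distance $\pi$ from each end, leaving a propagator between two stubs, while a shorter band is absorbed into an elementary vertex. The function name and the sample heights are hypothetical.

```python
import math

STUB = math.pi  # length of the stub left attached to each boundary

def classify_internal_band(height):
    """Split an internal band of closed geodesics into vertex and propagator parts.

    Returns (kind, propagator_length). A band taller than 2*pi yields a
    propagator of length height - 2*pi between two stubs of length pi;
    a shorter band is considered part of an elementary vertex.
    """
    if height > 2 * STUB:
        return ("propagator", height - 2 * STUB)
    return ("vertex", 0.0)

samples = [0.5 * math.pi, 2.0 * math.pi, 3.0 * math.pi]
results = [classify_internal_band(height) for height in samples]

for height, (kind, length) in zip(samples, results):
    print(f"height={height:.4f}  kind={kind}  propagator length={length:.4f}")
```

A band of height exactly $2\pi$ is treated here as part of a vertex, matching the “greater than $2\pi$” condition in the text; it is the limiting case of a propagator of zero length.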

These geometric data can be translated into an iterative algebraic
construction of the full quantum action $S[\Psi]$. The
$\mathcal{V}_{g,n}$ satisfy geometric recursion relations whose
algebraic counterpart is the quantum BV master equation for
$S[\Psi]$. Remarkably, the singularities of the $\Delta$ operator
encountered in OSFT are absent here, precisely because of the
presence of the stubs. We refer to Zwiebach (1993) for a complete
discussion of closed SFT.


Open/Closed SFT 


There is also a covariant SFT that includes both open 
and closed strings as fundamental variables. The 
Feynman rules arise from the following problem. 


Minimal-area problem for open/closed SFT Let $R_{oc}$ be a Riemann
surface, with or without boundaries, possibly with open and closed
punctures. Find the (conformal) metric of minimal area on $R_{oc}$
such that all nontrivial Jordan open curves have length greater
than or equal to $l_o = \pi$, and all nontrivial Jordan closed
curves have length greater than or equal to $l_c = 2\pi$.


The surface $R_{oc}$ is decomposed in terms of elementary vertices
$\mathcal{V}^{g,n}_{b,m}$ (of genus $g$, $b$ boundary components,
$n$ closed-string punctures and $m$ open-string punctures) joined
by open and closed propagators. Degenerations of the surface
always correspond to propagators becoming of infinite length:
factorization is manifest both in the open and in the closed
channel.

The SFT described in the section “Closed Strings in OSFT” (Witten
OSFT augmented with the single open/closed vertex [35])
corresponds to taking $l_o = \pi$ and $l_c = 0$. Varying
$l_c \in [0, 2\pi]$, we find a whole family of interpolating SFTs.
This construction clarifies the special status of the Witten
theory: moduli space is covered by a single cubic open overlap
vertex, with no need to introduce dynamical closed strings, but at
the price of a somewhat singular formulation.


Classical Solutions in Open SFT 


In the present formulation of SFT, a background (a classical
solution of string theory) must be chosen from the outset. The
very definition of the string field requires specifying a
(B)CFT$_0$. Intuitively, the string field lives in the “tangent”
to the “theory space” at a specific point, where “theory space” is
some notion of a “space of 2D (boundary) quantum field theories,”
not necessarily conformal. In the early 1990s, independence from
the choice of background was demonstrated for infinitesimal
deformations: the SFT actions written using neighboring (B)CFTs
are indeed related by a field redefinition. In recent years, it
has become apparent that at least the open-string field reaches
out to open-string backgrounds a finite distance away, possibly
covering the whole of theory space. (Classical solutions of closed
SFT are beginning to be investigated at the time of this writing
(2005).)

The OSFT action written using ${\rm BCFT}_0$ data is just the full
world-volume action of the D-brane with ${\rm BCFT}_0$ boundary
conditions. Which classical solutions should we expect in this
OSFT? In the bosonic string, Dp branes carry no conserved charge
and are unstable. This instability is reflected in the presence of
a mode with $m^2 = -1/\alpha'$, the open-string tachyon
$T(x^\mu)$, $\mu = 0, \ldots, p$. From this physical picture, Sen
argued that:


1. the tachyon potential, obtained by eliminating the higher
modes of the string field by their equations of motion, must
admit a local minimum corresponding to the vacuum with no
D-brane at all (henceforth, the tachyon vacuum,
$T(x^\mu) = T_0$);

2. the value of the potential at $T_0$ (measured with respect to
the ${\rm BCFT}_0$ point $T = 0$) must be exactly equal to minus
the tension of the brane with ${\rm BCFT}_0$ boundary conditions;

3. there must be no perturbative open-string excitations around
the tachyon vacuum; and

4. there must be space-dependent “lump” solutions corresponding
to lower-dimensional branes. For example, a lump localized along
one world-volume direction, say $x^1$, such that
$T(x^1) \to T_0$ as $x^1 \to \pm\infty$, is identified with a
D(p - 1) brane.
Sen’s conjectures have all been verified in OSFT. (See Sen (2004)
and Taylor and Zwiebach (2003) for reviews.) The deceptively
simple-looking equations of motion (in Siegel gauge)

$$ L_0|\Phi\rangle + b_0\left( |\Phi\rangle * |\Phi\rangle \right) = 0 $$    [40]

are really an infinite system of coupled equations, and no
analytic solutions are known. Turning on a vacuum expectation
value (VEV) for the tachyon drives into condensation an infinite
tower of modes. Fortunately, the approximation technique of “level
truncation” is surprisingly effective. The string field is
restricted to modes with an $L_0$ eigenvalue smaller than a
prescribed maximal level $L$. For any finite $L$, the truncated
OSFT contains a finite number of fields and numerical computations
are possible. Numerical results for various classical solutions
converge quite rapidly as the level $L$ is increased.
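
The crudest truncation keeps only the tachyon mode. The sketch below assumes the normalized level-0 potential f(t) = 2 pi^2 (-t^2/2 + K^3 t^3/3) with K = 3 sqrt(3)/4, in units where $\alpha' = 1$ and f = -1 corresponds to minus the brane tension. This form and its constants are quoted from the level-truncation literature reviewed in the references above, not derived in this article, so they should be read as assumptions. The names "f" and "stationary_point" are illustrative.

```python
import math

K = 3 * math.sqrt(3) / 4  # cubic-vertex constant quoted from the literature (assumed)

def f(t):
    """Level-0 tachyon potential in units of the brane tension (alpha' = 1)."""
    return 2 * math.pi ** 2 * (-0.5 * t ** 2 + (K ** 3 / 3) * t ** 3)

def stationary_point(t0=0.5, tol=1e-14, max_iter=50):
    """Newton iteration on f'(t) = 0, started near the nontrivial minimum.

    The overall factor 2 pi^2 cancels between gradient and curvature,
    so it is dropped from both.
    """
    t = t0
    for _ in range(max_iter):
        grad = -t + K ** 3 * t ** 2
        hess = -1 + 2 * K ** 3 * t
        step = grad / hess
        t -= step
        if abs(step) < tol:
            break
    return t

t_c = stationary_point()
depth = f(t_c)

print("tachyon VEV at level 0:", t_c)
print("potential depth in units of the brane tension:", depth)
```

Under these assumptions the minimum sits at t = 1/K^3 and its depth is -pi^2/(3 K^6), roughly 68% of the value required by Sen’s second conjecture; higher levels are needed to close the gap.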

The most important solution is the string field
$|\mathcal{T}\rangle$ that corresponds to the tachyon vacuum. A
remarkable feature of $|\mathcal{T}\rangle$ is universality: it can
be written as a linear combination of modes obtained by acting on
the tachyon $c_1|0\rangle$ with ghost oscillators and matter
Virasoro operators,

$$ |\mathcal{T}\rangle = T_0 \, c_1|0\rangle + u \, L^m_{-2} c_1|0\rangle + v \, c_{-1}|0\rangle + \cdots $$

This implies that the properties of $|\mathcal{T}\rangle$ are
independent of any detail of ${\rm BCFT}_0$, since all computations
involving $|\mathcal{T}\rangle$ can be reduced to purely
combinatorial manipulations involving the ghosts and the Virasoro
algebra. The numerical results strongly confirm Sen’s conjectures,
and indicate that the tachyon vacuum is located at a nonsingular
point in configuration space. Numerical solutions describing
lower-dimensional branes and exactly marginal deformations are
also available. For example, the full family of solutions
interpolating between a D1 and a D0 brane at the self-dual radius
has been found. There is increasing evidence that the open-string
field provides a faithful map of the open-string landscape.


Vacuum SFT: D-branes as Projectors 


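In the absence of a closed-form expression for
$|\mathcal{T}\rangle$, we are led to guesswork. When expanded
around $|\mathcal{T}\rangle$, the OSFT is still cubic, only with a
different kinetic term $\mathcal{Q}$,

$$ S = -\kappa_0 \left( \frac{1}{2}\langle\Phi|\mathcal{Q}|\Phi\rangle + \frac{1}{3}\langle\Phi|\Phi * \Phi\rangle \right) $$    [41]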


The operator $\mathcal{Q}$ must obey all the formal properties
[17], must be universal (constructed from ghosts and matter
Virasoro operators), and must have trivial cohomology at
$G = +1$. Another constraint comes from requiring that [41] admits
classical solutions in Siegel gauge. The choice

$$ \mathcal{Q} = \frac{1}{2i}\left( c(i) - c(-i) \right) = c_0 - (c_2 + c_{-2}) + (c_4 + c_{-4}) - \cdots $$    [42]


satisfies all these requirements. The conjecture (Rastelli et al.
2001) is that, by a field redefinition, the kinetic term around
the tachyon vacuum can be cast into this form. This “purely ghost”
$\mathcal{Q}$ is somewhat singular (it acts at the delicate string
midpoint), and presumably should be regarded as the leading term
of a more complicated operator that includes matter pieces as
well. The normalization constant $\kappa_0$ is formally infinite.
Nevertheless, a regulator (e.g., level truncation) can be
introduced, and physical observables are finite and independent of
the regulator. The vacuum SFT ([41]-[42]) appears to capture the
correct physics, at least at the classical level. Taking a
matter/ghost factorized ansatz

$$ |\Phi_g\rangle \otimes |\Phi_m\rangle $$    [43]

and assuming that the ghost part is universal for all D-brane
solutions, the equations of motion reduce to the following
equation for the matter part:

$$ |\Phi_m\rangle * |\Phi_m\rangle = |\Phi_m\rangle $$    [44]
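
The mode expansion quoted in [42] can be checked directly. The sketch below assumes the standard mode expansion of the ghost, c(z) = sum over n of c_n z^(1-n), which is not spelled out in the text, and reads off the coefficient of each c_n in (c(i) - c(-i))/(2i). The function name and the range of modes are illustrative.

```python
def mode_coefficient(n):
    """Coefficient of c_n in (c(i) - c(-i)) / (2i), assuming c(z) = sum_n c_n z^(1-n)."""
    return ((1j) ** (1 - n) - (-1j) ** (1 - n)) / 2j

coefficients = {n: mode_coefficient(n) for n in range(-4, 5)}

for n in sorted(coefficients):
    value = coefficients[n]
    print(f"c_{n}: {value.real:+.1f}")
```

Odd modes drop out and the even modes alternate in sign, reproducing the pattern of signs in [42] under the assumed convention.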


A solution $|\Phi_m\rangle$ can be regarded as a projector acting
in “half-string space.” Recall that the *-product looks formally
like a matrix multiplication [19]: the matrices are the string
fields, whose “indices” run over the half-string curves. These
projector equations have been exactly solved by many different
techniques (see Rastelli (2004) for a review). In particular,
there is a general BCFT construction that shows that one can
obtain solutions corresponding to any D-brane configuration,
including multiple branes: the rank of the projector is the number
of branes. A rank-one projector corresponds to an open-string
functional which is left/right split,
$\Phi[X(\sigma)] = F_L(X_L) F_R(X_R)$. There is also a clear
analogy between these solutions and the soliton solutions of
noncommutative field theory. The analogy can be made sharper using
a formalism that rewrites the open-string *-product as the tensor
product of infinitely many Moyal products. (See Bars (2002) and
references therein.)
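
The projector equation [44] and the notion of rank have a direct finite-dimensional analogue. The sketch below again assumes a toy model in which half-string space has only three discrete labels, so that string fields are 3 x 3 tables and the *-product is a matrix product. A left/right split functional becomes an outer product, and two mutually orthogonal rank-one pieces are added to mimic a two-brane configuration. All names and numerical values are illustrative.

```python
def star(a, b):
    """Toy *-product: matrix multiplication over discrete half-string labels."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def outer(f_left, f_right):
    """Left/right split functional F_L(X_L) F_R(X_R) as an outer product."""
    return [[fl * fr for fr in f_right] for fl in f_left]

def add(a, b):
    """Entrywise sum of two tables of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def trace(a):
    """Sum of diagonal entries; equals the rank for a projector."""
    return sum(a[i][i] for i in range(len(a)))

def max_abs_diff(a, b):
    """Largest entrywise difference between two tables of equal shape."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Rank-one projector: the overlap of F_R with F_L is normalized to one.
p1 = outer([0.6, 0.8, 0.0], [0.6, 0.8, 0.0])
# A second rank-one projector, orthogonal to the first.
p2 = outer([0.0, 0.0, 1.0], [0.0, 0.0, 1.0])
p = add(p1, p2)

defect_rank_one = max_abs_diff(star(p1, p1), p1)
defect_rank_two = max_abs_diff(star(p, p), p)
cross_term = max_abs_diff(star(p1, p2), [[0.0] * 3 for _ in range(3)])

print("rank-one projector defect:", defect_rank_one, "trace:", trace(p1))
print("rank-two projector defect:", defect_rank_two, "trace:", trace(p))
print("orthogonality defect:", cross_term)
```

In this toy setting the trace counts the number of independent rank-one pieces, which is the finite-dimensional counterpart of the statement that the rank of the projector is the number of branes.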

It is unclear whether or not multiple-brane solutions (should)
exist in the original OSFT; they are yet to be found in level
truncation. Understanding this and other issues, like the precise
role of closed strings in the quantum theory, seems to require a
precise characterization of the allowed space of open-string
functionals. In principle, the path integral over such functionals
would define the theory at the full nonperturbative level. This
remains a challenge for the future.


Note Added in Proof Very recently, M Schnabl, building on
previous work on star algebra projectors and related surface
states (Rastelli L (2004) and references therein), was able to
find the exact solution for the universal tachyon condensate in
OSFT. This breakthrough is likely to lead to rapid new
developments in SFT.


See also: Boundary Conformal Field Theory; BRST 
Quantization; Chern-Simons Models: Rigorous Results; 
Fedosov Quantization; The Jones Polynomial; Large-N 
and Topological Strings; Large-N Dualities; 
Noncommutative Geometry from Strings; 
Noncommutative Tori, Yang-Mills, and String Theory; 
Operads; Superstring Theories; Topological Quantum 
Field Theory: Overview; Two-Dimensional Conformal 
Field Theory and Vertex Operator Algebras. 


String Theory and Compactification


String theory provides a setup in which gauge and gravitational
interactions can be described in a unified framework consistently
at the quantum level. As such, it provides a candidate theory in
which to describe the standard model of particle physics
(describing quarks and leptons and their strong and electroweak
interactions) and gravity within the same quantum theory.

String theory has a unique fundamental scale $M_s$, fixed by the
string tension, often encoded in the parameter $\alpha'$ of
dimension (length)$^2$. All other scales are derived from this one
and are background dependent.

Most of the string theory phenomenological model building has
centered on the critical superstrings, which are ten dimensional
(10D) and involve spacetime (as well as world-sheet)
supersymmetry. There are five such different 10D theories: type
IIA, type IIB, type I, and the $E_8 \times E_8$ and SO(32)
heterotic theories. The heterotic theories include nonabelian
gauge fields and charged fermions in ten dimensions; hence, they
constitute a promising setup to embed the standard model. On the
other hand, the possibility of including D-branes (which carry
nonabelian gauge symmetries and charged matter) in
compactifications of type II theories (and orientifolds thereof,
like the type I theory itself) makes the latter reasonable
alternative setups to embed the standard model as a brane world.
The different 10D theories (as well as the 11D M-theory) are
related by diverse dualities, also upon compactification. This
suggests that they are just different limits of a unique
underlying theory. For 4D models, this implies that the different
classes of constructions are ultimately related by dualities, and
that often a given model may be realized using different string
theory constructions as starting points.

In order to recover 4D physics at low energies, 
compactification of the theory is required. In 
geometrical terms, the theory is required to propagate
on a spacetime with geometry $M_4 \times X_6$, where
$M_4$ is 4D Minkowski space and $X_6$ is a compact
manifold. This description is valid in the regime of a
large compactification volume, $\alpha'/R^2 \ll 1$ (where $R$
is the overall scale of the compact manifold), where
$\alpha'$ string theory corrections are negligible. Other 4D
string models may be constructed using abstract 
conformal field theories. They may often be 
regarded as extrapolations of geometric compactifi- 
cations to the regime of sizes comparable with the 
string length, where string theory corrections are 
relevant and the classical geometric picture does not 
hold. 

In the simplest situation of geometrical compacti- 
fication, not including additional backgrounds 
beyond the metric, the requirement of 4D spacetime 
supersymmetry (useful for the stability of the model, 
as well as of phenomenological interest) implies that 
the space $X_6$ is endowed with an SU(3) holonomy
metric. Existence of such metrics is guaranteed for
Calabi-Yau spaces, namely Kähler manifolds with
vanishing first Chern class.

There are a very large number of 4D super- 
symmetric string models that can be constructed 
using different starting string theories and different 
compactification manifolds. They lead to different 
4D spectra, often including nonabelian gauge sym- 
metries and charged chiral fermions (but only rarely 
resembling the actual standard model). In addition, 
for each given model, there exist, in general, a large 
number of massless 4D scalars, known as moduli, 
whose vacuum expectation values are not fixed. 
They parametrize different choices of the compacti- 
fication data in a given topological sector (e.g., 
Kähler and complex structure moduli of the internal
Calabi-Yau space). All physical parameters of the 
4D theory vary continuously with the vacuum 
expectation values of these scalars. 


All such models are on equal footing from the 
point of view of the theory. Hence, 4D string models 
suffer from a large arbitrariness. Although the 
breaking of supersymmetry clearly changes the 
picture qualitatively (e.g., flat directions associated 
to moduli are lifted by radiative corrections), it is 
difficult to evaluate this impact. 

In this situation, most of the research in string 
theory phenomenology has centered on the study of 
generic properties of certain classes of compactifica- 
tions, with the potential to lead to realistic struc- 
tures (such as N=1 or no supersymmetry, 
nonabelian gauge symmetries with replicated sets 
of charged chiral fermions). Within each class, 
explicit models (as close as possible to the standard 
model) have also been constructed. Generic predic- 
tions or expectations for phenomenology can be 
obtained within each setup, but quantitative results, 
even for explicit models, are always functions of 
undetermined moduli vacuum expectation values. 
Tractable mechanisms for moduli stabilization are 
under active research, although only preliminary 
results are available presently. 

The better-studied classes of models are compac- 
tifications of heterotic theories on Calabi-Yau 
spaces, and compactifications of type II theories (or 
orientifolds thereof) with D-branes. Other possibilities
include the heterotic M-theory, the M-theory
on $G_2$ holonomy varieties, the F-theory on Calabi-Yau
4-folds, etc. As already mentioned, different
classes (or even explicit models) are often related by 
string duality. 


Heterotic String Phenomenology 


A large class of phenomenologically interesting 
string vacua, which has been explored in depth, is 
provided by 4D compactifications of (any of the 
two) perturbative heterotic string theories. Compac- 
tification on large volume manifolds can be 
described in the supergravity approximation. As
shown by Candelas, Horowitz, Strominger, and
Witten, 4D N = 1 supersymmetry requires the internal
manifold to have SU(3) holonomy, a condition which is
satisfied by Calabi-Yau manifolds. In the presence of
curvature, the Bianchi identity for the Kalb-Ramond
2-form B is modified, so that, in general, it reads


$dH = \mathrm{tr}\, R^2 - \mathrm{tr}\, F^2$ [1]


where H is the field strength 3-form, R is the curvature
2-form, and F is the field strength, in the adjoint
representation, of the 10D gauge fields. Regarding
the above equation in cohomology leads to a
consistency condition, forcing the background gauge
bundle V to be topologically nontrivial, with


$c_2(V) = c_2(TX_6)$ [2]


where $c_2$ denotes the second Chern class, and $TX_6$ is
the tangent bundle of the compactification space.

The condition of supersymmetry implies that the 
gauge fields must be solutions of the Donaldson- 
Uhlenbeck-Yau equations. Existence of such a solu- 
tion is guaranteed for holomorphic and stable gauge 
bundles. The simplest solution to these conditions is 
the so-called standard embedding, where the gauge 
connection is locally identical to the spin connection, 
but more general solutions exist and have been 
characterized for particular classes of Calabi-Yau 
manifolds (e.g., when they are elliptically fibered). 
The gauge background bundle V, with structure
group H, breaks the 10D gauge symmetry G to its
commutant subgroup $G_{4D}$. The latter corresponds to
the 4D gauge symmetry. Moreover, the background
bundle modifies the Kaluza-Klein reduction of the
10D charged fermions, leading to a nonzero number
of replicated 4D chiral fermions. Decomposing the
adjoint representation of G (in which 10D fermions
transform) with respect to $G_{4D} \times H$,


$\mathrm{Adj}\, G = \bigoplus_i (R_{G_{4D},i}, R_{H,i})$ [3]


the net number of 4D chiral fermions in the
representation $R_{G_{4D},i}$ is given by the index of the
Dirac operator coupled to V in the representation
$R_{H,i}$. Condition [1] implies proper cancellation of
chiral anomalies in the resulting theory. A simple
and well-studied class is provided by standard
embedding compactifications of the $E_8 \times E_8$ heterotic
string theory, whose unbroken 4D gauge group is
$E_6 \times E_8$. The number of families (i.e., chiral
multiplets in the representation 27 of $E_6$) and conjugate
families (in the $\overline{27}$) are given by the Hodge numbers


$n_{27} = h^{1,1}(X_6), \quad n_{\overline{27}} = h^{2,1}(X_6)$ [4]
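
As a concrete instance of the decomposition [3], one may take the standard embedding in a single $E_8$ factor, where the bundle has structure group H = SU(3) and the commutant is $E_6$. The branching rule below is the standard group-theoretical one, quoted here only as an illustration; the last line is a dimension count that serves as a quick consistency check.

```latex
\begin{align*}
E_8 &\supset E_6 \times SU(3), \\
\mathbf{248} &= (\mathbf{78},\mathbf{1}) \oplus (\mathbf{1},\mathbf{8})
   \oplus (\mathbf{27},\mathbf{3}) \oplus (\overline{\mathbf{27}},\overline{\mathbf{3}}), \\
248 &= 78 + 8 + 27\cdot 3 + 27\cdot 3 .
\end{align*}
```

In this example the 4D families and conjugate families descend from the $(\mathbf{27},\mathbf{3})$ and $(\overline{\mathbf{27}},\overline{\mathbf{3}})$ terms, and their multiplicities are the index counts described in the text.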


More specifically, the harmonic representatives in 
each cohomology class represent the internal profile 
of the corresponding 4D fields. The net number of 
families is thus determined by the Euler characteristic
$\chi(X_6)$


$n_{\rm fam} = |h^{2,1} - h^{1,1}| = \tfrac{1}{2}|\chi(X_6)|$ [5]
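
The counting in eqn [5] can be sketched in a few lines of Python. The sketch assumes the usual relation $\chi = 2(h^{1,1} - h^{2,1})$ for a Calabi-Yau threefold; the function names are illustrative, and the quintic hypersurface in $CP^4$ (with $h^{1,1} = 1$, $h^{2,1} = 101$) is used only as a familiar example, not as a model discussed in this article.

```python
def euler_characteristic(h11: int, h21: int) -> int:
    """Euler characteristic of a Calabi-Yau threefold, chi = 2 (h11 - h21)."""
    return 2 * (h11 - h21)


def net_families(h11: int, h21: int) -> int:
    """Net number of families for the standard embedding, |h21 - h11| = |chi| / 2."""
    return abs(h21 - h11)


if __name__ == "__main__":
    # Quintic threefold: Hodge numbers (h11, h21) = (1, 101).
    h11, h21 = 1, 101
    print("chi      =", euler_characteristic(h11, h21))
    print("families =", net_families(h11, h21))
```

For the quintic this gives $\chi = -200$ and 100 net families, which illustrates why manifolds with small $|\chi|$ are the ones of phenomenological interest in standard embedding models.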


Recently, much progress in heterotic model building 
has been achieved in nonstandard embedding com- 
pactifications by the detailed construction of holo- 
morphic stable bundles and the computation of the 
diverse indexes. In particular, explicit models with 
just the minimal supersymmetric standard model 
spectrum have been constructed. 

The above geometric approach has several limitations.
On the technical side, the construction of
explicit holomorphic and stable gauge bundles is 
nontrivial from the mathematical viewpoint. On the 
more fundamental side, it allows one to explore only 
the large volume limit of heterotic compactifications. 

Further insight into the latter aspect can be 
obtained via constructions based on exactly solvable 
conformal field theories (CFTs), which describe the 
world-sheet string dynamics in compactifications, 
including all $\alpha'$ corrections, and, therefore, allowing
one to enter the small volume regime. The simplest
such compactifications are provided by toroidal
orbifolds, which describe string propagation in
quotients of toroidal compactifications by a discrete
group $\Gamma$. From the world-sheet viewpoint, they are
described by free 2D CFTs, which nevertheless include
sectors of closed strings with boundary conditions twisted
by elements of $\Gamma$. The resulting 4D theory contains
chiral fermions, arising from the untwisted and
twisted sectors. In the former, the nonchiral spectrum
of toroidal compactification suffers a projection
onto the $\Gamma$-invariant states and leads to
chirality. Twisted sectors are localized at the fixed 
points of the orbifold action, where the local 
supersymmetry is reduced, leading naturally to 
chiral fermions. 

Many of these models can be regarded as
compactifications on Calabi-Yau spaces in the limit
in which they become locally flat and develop
conical singularities (and similarly, their gauge 
bundles become locally flat and with curvature 
localized near the singular points). Indeed, flat 
directions involving moduli fields in the twisted 
sector often exist, which correspond to geometric 
blow-ups of the singular point that resolve the 
conical singularities to yield a smooth Calabi-Yau. 

The theories remain simple and solvable for any 
value of the untwisted moduli (namely moduli of the 
underlying toroidal compactification). This allows 
the discussion of their low-energy effective action 
including the explicit dependence on the untwisted 
moduli, while only partial results for the dependence 
on twisted moduli are known. 

Other approaches, such as free fermion construc- 
tions or Gepner models, also provide exact descrip- 
tions of compactifications, although only at a point 
of the moduli space, deep inside the small volume 
regime. 

Exact CFT constructions provide a small volume 
description of Calabi-Yau compactifications, at 
least for particular models. Moreover, their consis- 
tency conditions (modular invariance of the parti- 
tion function) provide a stringy version of the large 
volume geometric condition implied by eqn [2]. The 
constructions also show the existence of full-fledged
string theory constructions with properties similar to
geometric compactifications, but incorporating all $\alpha'$
corrections.

Within the general class of perturbative heterotic 
string models, a certain number of phenomenologi- 
cally interesting statements are quite generic. 


• The 4D Planck scale $M_P$ and gauge coupling $g_{YM}$
(at the string scale) are related to the fundamental
string scale by


$M_s = M_P\, g_{YM}$ [6]


This implies that the string scale is close to the 4D 
Planck scale. In this situation, supersymmetry can 
stabilize the electroweak scale against radiative 
corrections. 

• 4D heterotic models contain certain U(1) symmetries
whose gauge bosons actually get Stückelberg
masses due to $B \wedge F$ couplings to components of
the 2-form. Such U(1)'s would correspond to
global symmetries, but are violated at tree level by
effects nonperturbative in $\alpha'$, namely world-sheet
instantons. Hence, no continuous global symmetries
exist, even perturbatively, in these models.
Proton decay might, however, be avoided by 
discrete global symmetries. In any event, even 
without such symmetries, the large fundamental 
scale suppresses the processes mediating proton 
decay. Thus, the proton lifetime is naturally larger 
than present experimental bounds. 

• Gauge coupling constants for the different gauge
factors in the standard model unify at the string 
scale. This agrees with extrapolation from their 
electroweak values, assuming the minimal super- 
symmetric standard model content between the 
electroweak and string scale, up to a mismatch of 
scales (by a factor of 20). The latter may be 
addressed in diverse ways, such as threshold 
corrections, intermediate scales, or in the heterotic 
M-theory. 

• Yukawa couplings are, in principle, computable.
Explicit computations have been carried out in 
standard embedding geometric compactifications 
(where they amount to the overlap integral of the 
internal profiles of the 4D fields, namely a 
topological intersection number), and in orbifold 
models. They are in general moduli dependent, so 
their quantitative analysis is involved. Qualita- 
tively, however, interesting patterns, such as 
hierarchical structures, are possible, for example, 
in specific orbifold models. 
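
The extrapolation of the gauge couplings mentioned in the list above can be illustrated with a one-loop sketch. The inputs are assumptions made for this example and are not taken from the article: approximate inverse couplings at the Z mass (with the U(1) coupling in the usual unified normalization) and the standard one-loop MSSM beta coefficients (33/5, 1, -3). Threshold effects and higher loops are ignored.

```python
import math

M_Z = 91.19  # GeV, Z boson mass (illustrative input)

# Approximate inverse couplings at M_Z for U(1) (unified normalization),
# SU(2) and SU(3); rounded values assumed for this sketch.
ALPHA_INV_MZ = (59.0, 29.6, 8.5)

# One-loop beta coefficients with MSSM matter content.
B_MSSM = (33.0 / 5.0, 1.0, -3.0)


def alpha_inv(i: int, mu: float) -> float:
    """One-loop running: alpha_i^-1(mu) = alpha_i^-1(M_Z) - b_i/(2 pi) ln(mu/M_Z)."""
    return ALPHA_INV_MZ[i] - B_MSSM[i] / (2.0 * math.pi) * math.log(mu / M_Z)


def crossing_scale(i: int, j: int) -> float:
    """Scale (in GeV) at which couplings i and j become equal at one loop."""
    t = 2.0 * math.pi * (ALPHA_INV_MZ[i] - ALPHA_INV_MZ[j]) / (B_MSSM[i] - B_MSSM[j])
    return M_Z * math.exp(t)


if __name__ == "__main__":
    mu = crossing_scale(0, 1)
    print(f"alpha_1 and alpha_2 meet near {mu:.2e} GeV")
    for i in range(3):
        print(f"alpha_{i + 1}^-1 at that scale: {alpha_inv(i, mu):.2f}")
```

With these inputs the first two couplings meet at roughly 2 x 10^16 GeV and the third lies close by. This extrapolated scale is the one that is then compared with the heterotic string scale.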


Heterotic models have been studied beyond the 
perturbative regime. For instance, the construction
of compactifications including nonperturbative
objects, namely 5-branes, has been pursued, as has
the strong coupling limit of the $E_8 \times E_8$
heterotic theory, described by compactifications of
M-theory on an interval (the so-called heterotic
M-theory or Hořava-Witten theory). The strong
coupling phenomena of the SO(32) heterotic theory 
can be addressed using dual type I (or other type II 
orientifold) constructions. 


D-Brane Phenomenology 


A different setup for realistic string theory compac- 
tifications, within the so-called brane-world con- 
structions, is provided by compactifications of type II 
string theories containing D-branes, or quotients 
thereof. A particularly relevant class of quotients 
involves quotienting out by world-sheet parity, 
accompanied by some $Z_2$ geometric action. The
resulting theories are denoted type II orientifolds, and 
contain orientifold planes, subspaces fixed under the 
geometric action, corresponding to regions where the 
orientation of a string can flip. Type II compactifica- 
tions with D-branes filling the noncompact dimen- 
sions must satisfy a set of consistency conditions, 
known as RR tadpole cancellation. This is the 
condition that, in the compact space, the charge of 
D-branes and orientifold planes under the different 
RR forms must cancel. For the $\mathbb{Z}$-valued charges, the
conditions read


$\sum_a N_a Q_a + Q_{Op} = 0$ [7]


where $N_a$ denotes the multiplicity of D-branes with
charge vector $Q_a$ under the RR fields, and $Q_{Op}$ is the
charge vector of the orientifold planes. Additional
discrete conditions may be present if the relevant 
K-theory group (classifying D-brane charges in the 
corresponding background) contains torsion pieces. 
The most familiar example of these constructions 
is provided by the type I string theory, which is an 
orientifold quotient of the type IIB theory by world- 
sheet parity (with no geometric action). The model 
can be regarded as containing one orientifold 
9-plane and 32 D9 branes (all filling out 10D 
spacetime), such that their RR charges with respect 
to the (nondynamical) RR 10-form cancel. 
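
The bookkeeping behind the tadpole condition [7] can be sketched as follows. The helper names are hypothetical, and charges are represented as integer vectors in units where a single D-brane carries charge +1. The type I example uses the fact stated above that 32 D9 branes cancel the charge of the orientifold 9-plane, so the plane carries -32 in these units.

```python
from typing import List, Sequence, Tuple

Stack = Tuple[int, Sequence[int]]  # (multiplicity N_a, charge vector Q_a)


def tadpole_residual(stacks: Sequence[Stack], q_oplane: Sequence[int]) -> List[int]:
    """Return sum_a N_a Q_a + Q_Op, component by component."""
    total = list(q_oplane)
    for n_a, q_a in stacks:
        if len(q_a) != len(total):
            raise ValueError("charge vectors must have equal length")
        for k, q in enumerate(q_a):
            total[k] += n_a * q
    return total


def tadpoles_cancel(stacks: Sequence[Stack], q_oplane: Sequence[int]) -> bool:
    """True when every component of the residual charge vanishes."""
    return all(x == 0 for x in tadpole_residual(stacks, q_oplane))


if __name__ == "__main__":
    # Type I: 32 D9 branes (charge +1 each) against one O9 plane (charge -32).
    type_i = [(32, (1,))]
    print(tadpoles_cancel(type_i, (-32,)))
```

This checks only the integer-valued charges; the discrete K-theory conditions mentioned above are not captured by such a sum.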
Supersymmetric geometric compactifications of 
type II theories and orientifolds must correspond to 
compactification on Calabi-Yau spaces in order to 
have a preserved spinor. Models with D-branes 
filling the noncompact dimensions may be broadly 
classified into two classes: type IIB compactifications 
with D(3 + 2p)-branes, wrapped on holomorphic
2p-cycles, and carrying holomorphic and stable
world-volume gauge bundles, and type IIA compactifications
with D6 branes wrapped on special
Lagrangian 3-cycles (in general, models with D4 
and D8 branes are not allowed since Calabi-Yau 
spaces do not have nontrivial 1- or 5-cycles on 
which to wrap the branes). This classification is a 
large volume realization of the general classification 
of supersymmetric configurations of D-branes into 
two classes, denoted A and B. 


Intersecting Brane Worlds 


Type IIA compactifications with A-branes corre- 
spond to compactifications of type IIA theory (or 
orientifolds thereof) with D6 branes wrapped on 
3-cycles of the internal Calabi-Yau space. In these 
models, each stack of N D6 branes generically leads 
to a U(N) gauge factor. Chirality arises from open 
strings stretched between pairs of branes at the 
corresponding intersections. The chiral fermions 
from an open string stretched between branes a 
and b transform in the bifundamental representation 
$(\square_a, \overline{\square}_b)$ of the gauge factors $U(N_a) \times U(N_b)$ of the
intersecting D6 brane stacks. In general, two
3-cycles in a 6D manifold intersect at points of the
internal space. Hence, such fermions arise in several
families, whose (net) number is given by the (net)
number of intersections of the corresponding
3-cycles $\Pi_a$, $\Pi_b$, namely the topological invariant
intersection number of their homology classes


$I_{ab} = [\Pi_a] \cdot [\Pi_b]$ [8]


Simple modifications of the above rules arise in 
some sectors in the presence of orientifold planes 
(e.g., the reduction of the gauge symmetry from 
unitary to orthogonal or symplectic factors for 
branes on top of orientifold planes). 
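
For the toroidal models discussed below, the intersection number [8] has a simple closed form, sketched here under stated assumptions: the internal space is a factorized six-torus (a product of three 2-tori), and each D6 brane wraps a 1-cycle with integer wrapping numbers $(n^i, m^i)$ on the i-th 2-torus. The sign convention and the sample wrapping numbers are illustrative choices, not data from a specific published model.

```python
from math import prod
from typing import Sequence, Tuple

Wrapping = Sequence[Tuple[int, int]]  # three pairs (n_i, m_i), one per 2-torus


def intersection_number(a: Wrapping, b: Wrapping) -> int:
    """I_ab = prod_i (n_a^i m_b^i - m_a^i n_b^i) on a factorized six-torus."""
    if len(a) != 3 or len(b) != 3:
        raise ValueError("expected wrapping numbers on three 2-tori")
    return prod(na * mb - ma * nb for (na, ma), (nb, mb) in zip(a, b))


if __name__ == "__main__":
    # Hypothetical stacks chosen so that three families appear at the intersection.
    brane_a = [(1, 0), (1, 0), (1, 0)]
    brane_b = [(1, 3), (1, 1), (1, 1)]
    print(intersection_number(brane_a, brane_b))
    print(intersection_number(brane_b, brane_a))
```

The two calls differ by a sign, reflecting the antisymmetry $I_{ab} = -I_{ba}$, which distinguishes the bifundamental representation from its conjugate.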

The RR tadpole cancellation conditions specify 
that the total homological charge carried by the D6 
branes (and the orientifold 6-planes) cancels. They
imply automatic cancellation of cubic nonabelian 
anomalies, and the cancellation of mixed U(1) 
anomalies by a Green-Schwarz mechanism mediated 
by 4D scalars from the RR closed-string sector. 

Explicit models with SM spectrum have been 
constructed in orientifolds of toroidal compactifica- 
tions in the nonsupersymmetric case, and in orbi- 
folds thereof in supersymmetric cases. The 
generalization of the above construction beyond 
toroidal situations is, in principle, possible, but 
difficult, due to the mathematically challenging 
task of constructing special Lagrangian submani- 
folds for general Calabi-Yau manifolds. 

Certain phenomenologically interesting quantities, 
such as gauge couplings and their threshold 
corrections, Yukawa couplings, and other diverse
correlation functions have been computed in toroidal
cases, where the corresponding correlators are
computable exactly in $\alpha'$. Particularly interesting is
the computation of Yukawa couplings, or, in 
general, of couplings involving only fields at inter- 
sections. These couplings arise from open-string 
world-sheet instantons, namely disks with bound- 
aries on the D-branes corresponding to those 
intersections. 


Type IIB Orientifolds 


Type IIB compactifications with B-type branes 
contain several familiar classes of 4D models, for 
instance, compactifications of type I string theory on 
smooth Calabi-Yau spaces (whose description may 
be carried out using the effective supergravity 
action, in close analogy with the heterotic compac- 
tifications). Compactifications of type I string theory 
on orbifolds can be regarded as a particular 
realization of this, easily described using exact 
CFTs (although from the viewpoint of the general 
description as B-branes, the appearance of lower- 
dimensional branes requires their mathematical 
description to involve coherent sheaves). Since 
open strings at orbifolds do not have twisted 
boundary conditions, chirality arises from the orbifold
projection on the spectrum of the toroidally
compactified theory.

Another example within this kind is provided by 
the so-called magnetized D-brane models. These 
correspond to toroidal compactifications of type I 
theory, with D9 branes carrying constant magnetic 
backgrounds for the internal components of the 
world-volume gauge fields. In this kind of model, 
although the closed-string sector is highly super- 
symmetric, the open-string spectrum has reduced 
supersymmetry, or no supersymmetry (if the bundle 
stability condition is relaxed). Chirality arises from 
the nontrivial index of the Dirac operator for open 
strings ending on D-branes with different world- 
volume magnetic fields. Explicit models have mainly 
centered on nonsupersymmetric models from orientifolds
of $T^6$, and on supersymmetric models from
orientifolds of the $T^6/(Z_2 \times Z_2)$ orbifold. In both
contexts, models with semirealistic spectra have 
been obtained: concretely nonsupersymmetric mod- 
els with just the standard model spectrum, or 
supersymmetric models with the minimal super- 
symmetric standard model spectrum, plus nonchiral 
matter. Further, properties of the gauge coupling 
constants and the computation of the Yukawa 
couplings have been studied as functions of unde- 
termined moduli. 

Finally, a second large class of models constructed
using B-type branes is given by lower-dimensional
D-branes, for example, D3 branes, located at singular
points in the internal compactification space. Since the
massless sector of open strings is determined only in
terms of the local structure of the singularity, these
models have been mostly studied in noncompact
setups. Resulting spectra can be encoded in quiver
diagrams, related to those in the mathematical literature
on the McKay correspondence. Semirealistic three-family
models have been constructed based on systems
of D3 and D7 branes at the $\mathbb{C}^3/Z_3$ orbifold singularity.

Type IIB orientifold compactifications are also 
intimately related to F-theory compactifications on 
Calabi-Yau 4-folds, which provide a nonperturba- 
tive completion for such models. 

Mirror symmetry exchanges type IIB and IIA
compactifications with B- and A-type branes. Hence, 
it provides a map between the above two kinds of 
compactifications. This shows that type IIB orienti- 
fold models lead to spectra with structure similar to 
that of intersecting brane worlds, and that they
share many of their general properties. 

As a particular example, toroidal models of 
intersecting D6 branes are mapped under mirror 
symmetry to models of magnetized D9 branes. This 
mirror map has been exploited to construct the same 
theories from both starting points and to recover 
certain quantities, such as the $\alpha'$-exact Yukawa
couplings in the IIA picture from a purely classical
(no $\alpha'$ corrections) computation in the mirror IIB
model. This is a particular application of the general 
proposal of homological mirror symmetry in com- 
pactifications with branes. 

Type II orientifold compactifications with
D-branes have also been explored beyond the 
geometric regime, using exact CFTs to describe the 
(analog of the) internal space, and crosscap and 
boundary states to describe (the analogs of) orienti- 
fold planes and D-branes. Formal developments in 
the construction of the latter in Gepner models have 
been successfully applied to obtain large classes of 
semirealistic 4D string models in this setup. 

As compared with heterotic compactifications, the 
setup of D-brane models leads to several generic 
features: 


• Since gauge sectors are localized on D-branes, and
have a dilaton dependence different from gravita- 
tional interactions, the relation between the 
fundamental string scale and the 4D Planck scale 
and gauge coupling reads 


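$M_P^2\, g_{YM}^2 = \frac{M_s^{11-p}\, V_\perp}{g_s}$ [9]


where $p$ is the spatial dimension of the D$p$-branes
carrying the gauge sector, $V_\perp$ is a measure of the
volume in the directions transverse to the brane, and
$g_s$ is the 10D string coupling. The above relation shows that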
it is possible to achieve large 4D Planck mass with 
a lower fundamental string scale by adjusting the 
transverse volume and the string coupling. This has 
been proposed by Antoniadis, Arkani-Hamed, 
Dimopoulos, and Dvali as an alternative to explain 
the Planck/weak hierarchy without supersymmetry. 

• The compactifications contain several U(1) gauge
symmetries. For some of the corresponding gauge
bosons, the 4D effective theory contains Stückelberg
masses of order $M_s$, due to $B \wedge F$ couplings
to fields in the RR sector. These couplings make
the U(1) gauge bosons massive; hence, they are
absent from the low-energy physics. Nevertheless,
the U(1)'s remain as global symmetries exact in $\alpha'$
and to all orders in the perturbation theory in $g_s$.
They are violated by D-brane instantons, which
are nonperturbative in $g_s$. In many realistic
models, the baryon number is one such global 
symmetry, and it prevents proton decay, even if 
the string scale is not large. 

• In general, each gauge factor in the standard
model arises from a different brane stack, and 
their gauge couplings at the string scale are 
controlled by different moduli. This implies that, 
generically, it is not natural to have gauge 
coupling unification in D-brane models. Particular 
models may enjoy enhanced discrete global 
symmetries at special points in moduli space 
where unification is achieved, thus making uni- 
fication appear more natural in such examples. 
Similar statements apply for constructions which 
realize complete or partial unification of gauge 
groups at large scales (like string models of grand 
unification or of Pati-Salam type). 

• As already mentioned, important quantities such
as Yukawa couplings are, in principle, computa- 
ble, although quantitative expressions have been 
derived only in a few examples, mostly in toroidal 
compactifications or quotients thereof. The results 
are moduli dependent, making it difficult to 
derive model-independent patterns. 
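
The scale relation in the first item of the list above can be turned into a rough estimate of how large the transverse volume must be for a given string scale. The sketch assumes relation [9] in the form $M_P^2 g_{YM}^2 = M_s^{11-p} V_\perp / g_s$, drops all order-one numerical factors, and uses illustrative input values (a 10 TeV string scale, $g_{YM} = 0.7$, $g_s = 0.1$) that are assumptions of this example only.

```python
import math


def transverse_volume_in_string_units(m_planck: float, m_string: float,
                                      g_ym: float, g_s: float) -> float:
    """V_perp * M_s^(9-p) = g_s * (g_YM * M_P / M_s)^2, from relation [9]."""
    return g_s * (g_ym * m_planck / m_string) ** 2


def radius_in_string_units(volume: float, n_transverse: int) -> float:
    """Typical radius R * M_s if the volume is shared equally by n dimensions."""
    return volume ** (1.0 / n_transverse)


if __name__ == "__main__":
    m_planck = 1.2e19  # GeV, 4D Planck mass (illustrative value)
    m_string = 1.0e4   # GeV, hypothetical low string scale
    v = transverse_volume_in_string_units(m_planck, m_string, g_ym=0.7, g_s=0.1)
    print(f"V_perp in string units: {v:.3e}")
    # D3 branes (p = 3) have six transverse compact dimensions.
    print(f"R * M_s for six dimensions: {radius_in_string_units(v, 6):.3e}")
```

Setting $M_s = g_{YM} M_P$ and $g_s = 1$ returns a volume of order one in string units, which is the regime of relation [6]; lowering $M_s$ requires a transverse volume that grows as $(M_P/M_s)^2$.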


M-Theory Phenomenology 


Most of the phenomenological models from the
M-theory have been constructed using the Hořava-Witten
theory (compactification of M-theory on
$S^1/Z_2$) as starting point. This theory provides a
description of the strong coupling regime of the
$E_8 \times E_8$ heterotic theory, and many of its basic
features are similar to those in the perturbative
regime. In particular, the techniques used in model
building involve the construction of stable and
holomorphic vector bundles and the computation
of the relevant indexes to obtain the 4D gauge group
and charged matter content. An important difference
is that gauge interactions propagate only over the
10D boundaries of spacetime, while gravity propagates
over the 11 dimensions. This makes the setup
share some features of brane-world constructions,
and, in particular, it allows one to lower the
fundamental scale of the theory (the 11D Planck
scale) to reconcile it with the traditional unification
scale.

A different setup for M-theory phenomenology
involves the compactification of the 11D theory on a
7-manifold $X_7$ of $G_2$ holonomy, in order to lead to
N = 1 supersymmetry in four dimensions. Although
a fundamental formulation of the M-theory is
lacking, duality arguments and indirect evidence
can be used to show that nonabelian gauge
symmetries of A-D-E type arise if
$X_7$ contains 3-cycles of codimension-4 singularities,
locally of the form $\mathbb{C}^2/\Gamma$, with $\Gamma$ an A-D-E Kleinian
subgroup of SU(2). Similarly, it can be shown that
chiral multiplets charged under these gauge symmetries
arise if $X_7$ contains certain codimension-7
singularities. The local geometry of the latter has
been explicitly described, and can be regarded as lying
at the intersections of codimension-4 singularities.

The direct construction of such singular $G_2$
holonomy manifolds is very difficult, and there are
no known topological conditions that guarantee
existence of such a metric for a fixed topology.
However, the existence of large classes of such
models can be indirectly shown by using duality
arguments. Namely, any type IIA model of intersecting
D6 branes and O6 planes preserving N = 1
supersymmetry lifts to an M-theory compactification
on a singular $G_2$ holonomy manifold. In fact,
the local structure of the codimension-4 and -7 
singularities agrees in particular cases with the local 
structure of D6 branes on 3-cycles and D6 brane 
intersections. 


Further Topics 


Some additional topics related to the phenomenology
of the string theory, but not covered by the
above model-building description, are discussed in
the following.


Effective Actions 


The construction of effective actions for such classes 
of models has been carried out in general in 
supersymmetric compactifications, using the 
parametrization of the general 4D N = 1 supergravity
action in terms of the Kähler potential for
the moduli and matter fields, the gauge kinetic
functions, and the superpotential. The moduli action
is quite universal, at least for geometric compactifications
and for untwisted moduli in orbifold
compactifications. For instance, the Kähler potential
for the 4D dilaton multiplet S and the modulus T
controlling the size of the internal manifold, in the
large volume and weak coupling regime, reads


$K = -\log(S + S^*) - 3\log(T + T^*)$ [10]


The corresponding expression including matter 
fields is more model dependent, but known within 
each particular class. 
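
As a short worked illustration of how the Kähler potential [10] is used, the Kähler metric that multiplies the kinetic terms of S and T follows by differentiating twice. The derivation below is a sketch under the same large volume and weak coupling assumptions as [10]; the last line records the well-known no-scale identity for the T modulus.

```latex
\begin{align*}
K &= -\log(S + S^*) - 3\log(T + T^*), \\
K_{S S^*} &= \frac{\partial^2 K}{\partial S\, \partial S^*} = \frac{1}{(S + S^*)^2},
\qquad
K_{T T^*} = \frac{\partial^2 K}{\partial T\, \partial T^*} = \frac{3}{(T + T^*)^2}, \\
K_T &= -\frac{3}{T + T^*}
\quad\Longrightarrow\quad
K^{T T^*} K_T K_{T^*} = \frac{(T + T^*)^2}{3}\cdot\frac{9}{(T + T^*)^2} = 3 .
\end{align*}
```

Both metric components are positive for positive real parts of S and T, as required for well-defined kinetic terms.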


Moduli Stabilization and Supersymmetry Breaking


Both issues are often related. Although moduli 
stabilization preserving supersymmetry is possible, 
it often occurs that the potential stabilizing moduli 
has its origin in mechanisms related to super- 
symmetry breaking. 

The description of purely string theoretical 
mechanisms to break supersymmetry is difficult, 
and most approaches rely on field-theoretical 
mechanisms in the effective action. One of the better- 
studied mechanisms, mostly in the heterotic string 
setup (but also in type II compactifications), is 
gaugino condensation in a strongly coupled hidden 
sector, interacting with the standard model sector 
via gravitational (or perhaps additional gauge) 
interactions. Although explicit models with such 
hidden sectors and strong dynamics exist, they 
often result in runaway potentials for moduli. 
Racetrack scenarios where several condensates 
balance each other are possible but contrived. 

A second mechanism to break supersymmetry, 
mostly explored in type IIB/F-theory compactifica- 
tions, is the introduction of field-strength fluxes for 
p-form fields. Interestingly, such fluxes lead to 
nontrivial potentials depending on moduli, and 
generically breaking supersymmetry. The existence
of several remnant flat directions in the leading $\alpha'$, $g_s$
approximation leaves unanswered the question of
possible runaway moduli potentials in those directions.
However, evidence for nonperturbative contributions
stabilizing the remaining moduli at finite
distance has been presented. Preliminary results in the
analysis of flux stabilized vacua have been obtained
in simple examples of (still unrealistic) Calabi-Yau
compactifications with a small number of moduli.

Most of the mechanisms explored propose supersymmetry
breaking below the Kaluza-Klein compactification
scale, and, therefore, can be described in the 4D
effective theory. They can be nicely parametrized in
terms of vacuum expectation values for the dilaton
and geometric moduli of the compactification. This
description allows for a computation of the soft
terms using the expansion of the N = 1 supergravity
formulas in components. Concrete patterns, such as
the universality of squark masses, or the complex 
phases of diverse soft terms, can be explored using 
this approach. 

Alternative mechanisms of breaking supersymme- 
try at higher scales, such as the introduction of 
antibranes or nonsupersymmetric compactifications, 
lead to generic difficulties with stability. 

Related to the question of supersymmetry break- 
ing is the question of the cosmological constant. 
Unfortunately, there is no manifest mechanism in 
the string theory that explains the smallness of the 
observed value of this scale. Given that many 
aspects of both quantum gravity in the string theory 
and realistic model building (with proper super- 
symmetry breaking and moduli stabilization) are 
still in progress, an open-minded point of view
on this problem and on the proposed solutions is advisable.


Cosmology 


Although somewhat different from the traditional 
focus of string phenomenology, recent progress in 
observational cosmology has triggered much interest 
in string theory realizations of inflationary models 
(or alternatives such as pre-big bang scenarios). 
Most inflationary models have centered on using 
moduli as the inflaton field, due to their flat 
potentials. A simple setup in type II compactifica- 
tions, known as brane inflation models, uses the 
modulus controlling a brane position as the inflaton 
field, which has a flat enough potential with a 
moderate fine-tuning. Such setups may lead to 
interesting additional features, such as a moderate 
but potentially observable density of cosmic strings 
created in the reheating process. 

On the other hand, many interesting questions in 
string cosmology await further understanding of 
time-dependent backgrounds in the string theory. 


Retrospect 


It is remarkable that the formal framework of
string theory admits tractable solutions that bear a
reasonable resemblance to the structure of the
standard model. In particular, features such
as nonabelian gauge symmetry and chirality, coupled
to gravity, are generic in 4D compactifications. This
is already a success. In addition, much progress has
been made in the general description of the relevant 
mathematical tools, and physical mechanisms and 
ingredients involved in these vacua, as well as in the 
explicit construction of models with the standard 
model spectrum (or supersymmetric extensions of 
it). Yet, many questions remain open and much 
more work is needed in order to make contact with 
the physics observed in nature. 


See also: Brane Worlds; Compactification of Superstring 
Theory; Cosmology: Mathematical Aspects; Superstring 
Theories. 


Further Reading 


Acharya B and Witten E (2001) Chiral fermions from manifolds 
of G(2) holonomy, hep-th/0109152. 

Aldazabal G, Ibáñez LE, Quevedo F, and Uranga AM (2000)
D-branes at singularities: a bottom up approach to the string
embedding of the standard model. Journal of High Energy
Physics 0008: 002.

Angelantonj C and Sagnotti A (2002) Open strings. Physics 
Reports 371: 1-150. 

Angelantonj C and Sagnotti A (2003) Open strings: erratum.
Physics Reports 376: 339-405.

Antoniadis I, Arkani-Hamed N, Dimopoulos S, and Dvali GR 
(1998) New dimensions at a millimeter to a Fermi and 
superstrings at a TeV. Physics Letters B 436: 257-263. 

Bachas C (1995) A way to break supersymmetry, hep-th/ 
9503030. 

Blumenhagen R, Cvetič M, Langacker P, and Shiu G (2005)
Toward realistic intersecting D-brane models, hep-th/ 
0502005. 

Candelas P, Horowitz GT, Strominger A, and Witten E (1985) 
Vacuum configurations for superstrings. Nuclear Physics B 
258: 46-74. 

Donagi R, He Y-H, Ovrut BA, and Reinbacher R (2004) The
spectra of heterotic standard model vacua, hep-th/0411156.

Green MB, Schwarz JH, and Witten E (1987) Superstring Theory.
Cambridge Monographs On Mathematical Physics, vols. 1
and 2. Cambridge: Cambridge University Press.

Ibáñez LE (1987) The search for a standard model SU(3) ×
SU(2) × U(1) superstring: an introduction to orbifold
constructions. Seoul Sympos. 1986, 46.

Polchinski J (1998) String Theory. vols. 1 and 2. Cambridge: 
Cambridge University Press. 

Uranga AM (2003) Chiral four-dimensional string
compactifications with intersecting D-branes. Classical and Quantum
Gravity 20: S373-S394.

Witten E (1996) Strong coupling expansion of Calabi-Yau 
compactification. Nuclear Physics B 471: 135-158. 


String Topology: Homotopy and Geometric Perspectives


RL Cohen, Stanford University, Stanford, CA, USA

© 2006 Elsevier Ltd. All rights reserved.


String topology is a new field of study involving the 
geometric and algebraic topology of spaces of loops 
and paths in manifolds. The subject was initiated in 
the important work of Chas and Sullivan (1999) 
who uncovered previously unknown algebraic struc- 
ture in the homology and equivariant homology of 
loop spaces. While the structure is purely topologi- 
cal, it was motivated by formalisms in quantum field 
theory and string theory. Since that time this subject 
has attracted the attention of many mathematicians, 
but one of the main lines of research continues to be 
motivated by the attempt to understand the relation 
between this structure (and its generalizations) with 
topological and conformal field theories. 

In order to describe some of the recent advances in 
this field, we begin with some notation. Throughout 
this article $M^n$ will denote a closed, $n$-dimensional,
oriented manifold. $LM$ will denote the free loop space,


\[ LM = \mathrm{Map}(S^1, M) \]


For $D_1, D_2 \subset M$ closed submanifolds, $P_M(D_1, D_2)$
will denote the space of paths in $M$ that start at $D_1$
and end at $D_2$,


\[ P_M(D_1, D_2) = \{\gamma : [0,1] \to M,\ \gamma(0) \in D_1,\ \gamma(1) \in D_2\} \]


The paths and loops we consider will always be 
assumed to be piecewise smooth. Such spaces of paths 
and loops are well known to be infinite-dimensional 
manifolds, and roughly speaking, string topology is the 
study of the intersection theory in these manifolds. 

Recall that for closed, oriented manifolds, there is
an intersection pairing,


\[ H_r(M) \times H_s(M) \to H_{r+s-n}(M) \]


which is defined to be Poincaré dual to the cup
product,


\[ H^{n-r}(M) \times H^{n-s}(M) \to H^{2n-r-s}(M) \]


The geometric significance of this pairing is that if
the homology classes are represented by submanifolds,
$P^r$ and $Q^s$ with transverse intersection, then
the image of the intersection pairing is represented
by the geometric intersection, $P \cap Q$.

The remarkable result of Chas and Sullivan says
that even without Poincaré duality, there is an
intersection type product


\[ \mu : H_p(LM) \times H_q(LM) \to H_{p+q-n}(LM) \]


that is compatible with both the intersection product
on $H_*(M)$ via the map $ev : LM \to M$ ($\gamma \mapsto \gamma(0)$), and
with the Pontrjagin product in $H_*(\Omega M)$.

The construction of this pairing involves
consideration of the diagram,


\[ LM \xleftarrow{\ \gamma\ } \mathrm{Map}(8, M) \xrightarrow{\ e\ } LM \times LM \qquad [1] \]


Here Map(8, M) is the mapping space from the
figure 8 to M, which can be viewed as the subspace
of $LM \times LM$ consisting of those pairs of loops that
agree at the basepoint. $\gamma : \mathrm{Map}(8, M) \to LM$ is the
map on mapping spaces induced by the pinch map
$S^1 \to S^1 \vee S^1$.

Chas and Sullivan constructed this pairing by
studying intersections of chains in loop spaces.
A more homotopy-theoretic viewpoint was taken
by Cohen and Jones (2002) who viewed
$e : \mathrm{Map}(8, M) \to LM \times LM$ as an embedding, and showed
there is a tubular neighborhood homeomorphic to a
normal bundle given by the pullback bundle, $ev^*(TM)$,
where $ev : LM \to M$ is the evaluation map mentioned
above. They then constructed a Pontrjagin-Thom
collapse map whose target is the Thom space of the
normal bundle, $\tau_e : LM \times LM \to \mathrm{Map}(8, M)^{ev^*(TM)}$.
Computing $\tau_e$ in homology and applying the Thom
isomorphism defines an “umkehr map,”


\[ e_! : H_*(LM \times LM) \to H_{*-n}(\mathrm{Map}(8, M)) \]


The Chas-Sullivan loop product is defined to be the
composition


\[ \mu_* = \gamma_* \circ e_! : H_*(LM \times LM) \to H_{*-n}(\mathrm{Map}(8, M)) \to H_{*-n}(LM) \]


Notice that the umkehr map $e_!$ can be defined for a
generalized homology theory $h_*$ whenever one has a
Thom isomorphism of the tangent bundle, TM,
which is to say a generalized homology theory $h_*$ for
which the representing spectrum is a ring spectrum,
and which supports an orientation of M.

By twisting the Pontrjagin-Thom construction by
the virtual bundle $-TM$, one obtains a map of
spectra,


\[ \tau_e : LM^{-TM} \wedge LM^{-TM} \to \mathrm{Map}(8, M)^{ev^*(-TM)} \]


where $LM^{-TM}$ is the Thom spectrum of the pullback
of the virtual bundle $ev^*(-TM)$. Now we can
compose, to obtain a multiplication,


\[ LM^{-TM} \wedge LM^{-TM} \xrightarrow{\ \tau_e\ } \mathrm{Map}(8, M)^{ev^*(-TM)} \xrightarrow{\ \gamma\ } LM^{-TM} \]


The following was proved by Cohen and Jones 
(2002). 
Theorem 1 Let M be a closed manifold, then
$LM^{-TM}$ is a ring spectrum. If M is orientable the ring
structure on $LM^{-TM}$ induces the Chas-Sullivan loop
product on $H_*(LM)$ by applying homology and the
Thom isomorphism.


The ring structure on the spectrum $LM^{-TM}$ was
also observed by Dwyer and Miller using different
methods.

Cohen and Godin (2004) generalized the loop 
product in the following way. Observe that the 
figure 8 is homotopy equivalent to the pair of pants 
surface P, which we think of as a genus 0 cobordism 
between two circles and one circle. 

Furthermore, diagram [1] is homotopic to the
diagram of mapping spaces,


\[ LM \xleftarrow{\ \rho_{out}\ } \mathrm{Map}(P, M) \xrightarrow{\ \rho_{in}\ } (LM)^2 \]


where $\rho_{in}$ and $\rho_{out}$ are restriction maps to the
“incoming” and “outgoing” boundary components
of the surface P. So the loop product can be viewed
as a composition,


\[ \mu = \mu_P = (\rho_{out})_* \circ (\rho_{in})_! : (H_*(LM))^{\otimes 2} \to H_*(\mathrm{Map}(P, M)) \to H_*(LM) \]


where using the figure 8 to replace the surface P can
be viewed as a technical device that allows one to
define the umkehr map $(\rho_{in})_!$.

In general if one considers a surface of genus g,
viewed as a cobordism from p incoming circles to q
outgoing circles, $\Sigma_{g,p+q}$, one gets a similar diagram
(Figure 2)


\[ (LM)^q \xleftarrow{\ \rho_{out}\ } \mathrm{Map}(\Sigma_{g,p+q}, M) \xrightarrow{\ \rho_{in}\ } (LM)^p \]


Figure 1 Pair of pants P.


Figure 2 $\Sigma_{g,p+q}$ (p incoming circles, q outgoing circles).


Cohen and Godin (2004) used the theory of “fat” or
“ribbon” graphs to represent surfaces as developed
by Harer (1985), Penner (1987), and Strebel (1984),
in order to define Pontrjagin-Thom maps,


\[ \tau_{\Sigma_{g,p+q}} : (LM)^p \to \mathrm{Map}(\Sigma_{g,p+q}, M)^{\nu(\Sigma_{g,p+q})} \]


where $\nu(\Sigma_{g,p+q})$ is the appropriately defined normal
bundle of $\rho_{in}$. By applying (perhaps generalized)
homology and the Thom isomorphism, they defined
the umkehr map,


\[ (\rho_{in})_! : H_*((LM)^p) \to H_{*+\chi(\Sigma_{g,p+q}) \cdot n}(\mathrm{Map}(\Sigma_{g,p+q}, M)) \]


where $\chi(\Sigma_{g,p+q}) = 2 - 2g - p - q$ is the Euler
characteristic. Cohen and Godin then defined the string
topology operation to be the composition,


\[ \mu_{\Sigma_{g,p+q}} = (\rho_{out})_* \circ (\rho_{in})_! : H_*((LM)^p) \to H_{*+\chi(\Sigma_{g,p+q}) \cdot n}(\mathrm{Map}(\Sigma_{g,p+q}, M)) \to H_{*+\chi(\Sigma_{g,p+q}) \cdot n}((LM)^q) \]


They proved that these operations respect gluing of
surfaces,


\[ \mu_{\Sigma_1 \# \Sigma_2} = \mu_{\Sigma_2} \circ \mu_{\Sigma_1} \]


where $\Sigma_1 \# \Sigma_2$ is the glued surface as shown in
Figure 3.
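
The degree bookkeeping behind these operations can be checked with a short script. The following is a minimal sketch rather than part of the original constructions: the function names `euler_characteristic`, `degree_shift`, and `glue` are hypothetical, both surfaces are assumed to be connected, and all q outgoing circles of the first surface are assumed to be glued to the q incoming circles of the second. It uses only $\chi = 2 - 2g - p - q$ and the fact that gluing along circles adds Euler characteristics, so the degree shifts $\chi \cdot n$ add under composition.

```python
def euler_characteristic(g, p, q):
    """Euler characteristic of a connected genus-g surface with p incoming
    and q outgoing boundary circles: chi = 2 - 2g - p - q."""
    return 2 - 2 * g - p - q


def degree_shift(g, p, q, n):
    """Homological degree shift chi * n of the operation attached to the
    surface, for a target manifold of dimension n."""
    return euler_characteristic(g, p, q) * n


def glue(surface1, surface2):
    """Glue (g1, p, q) to (g2, q, r) along the q shared circles.

    Assumes both surfaces are connected. Circles have Euler characteristic 0,
    so chi is additive; solving 2 - 2g - p - r = chi1 + chi2 for g gives
    g = g1 + g2 + q - 1.
    """
    g1, p, q1 = surface1
    g2, q2, r = surface2
    if q1 != q2:
        raise ValueError("outgoing and incoming circle counts must match")
    return (g1 + g2 + q1 - 1, p, r)


# Pair of pants: genus 0, two incoming circles, one outgoing circle.
pants = (0, 2, 1)
n = 3  # dimension of M, chosen only for illustration
print(degree_shift(*pants, n))  # the loop product lowers degree by n

# Glue the pants to its reverse (one incoming, two outgoing circles).
copants = (0, 1, 2)
glued = glue(pants, copants)
print(glued, degree_shift(*glued, n))
```

For the pair of pants the shift is $-n$, matching the degree of the loop product, and the shift of the glued surface equals the sum of the shifts of its two pieces.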

The coherence of these operations is summarized 
in the following theorem. 


Theorem 2 (Cohen and Godin 2004). Let $h_*$ be
any multiplicative generalized homology theory that
supports an orientation of M. Then the assignment


\[ \Sigma_{g,p+q} \longmapsto \mu_{\Sigma_{g,p+q}} : h_*((LM)^p) \to h_*((LM)^q) \]


is a positive boundary topological quantum field 
theory. “Positive boundary" refers to the fact that 
the number of outgoing boundary components, q, 
must be positive. 


A theory with open strings was initiated
by Sullivan (2004) and developed further by
A Ramirez (2005) and by Harrelson (2004). In this
setting one has a collection of submanifolds, $D_i \subset M$,
referred to as “D-branes.” This theory studies
intersections in the path spaces $P_M(D_i, D_j)$.


Figure 3 $\Sigma_1 \# \Sigma_2$ (labels: p circles, q circles, r circles).

A theory with D-branes involves “open-closed
cobordisms” which are cobordisms between
compact one-dimensional manifolds whose boundary is
partitioned into three parts:


1. Incoming circles and intervals.

2. Outgoing circles and intervals.

3. The rest is the “free boundary” which is itself a
cobordism between the boundary of the incoming
and boundary of the outgoing intervals. Each
connected component of the “free boundary” is
labeled by a D-brane (see Figure 4).


In a topological field theory with D-branes,
one associates to each boundary circle a vector
space $V_{S^1}$ (in our case $V_{S^1} = H_*(LM)$) and to an
interval whose endpoints are labeled by $D_i$, $D_j$, one
associates a vector space $V_{D_i,D_j}$ (in our case
$V_{D_i,D_j} = H_*(P_M(D_i, D_j))$).

To an open-closed cobordism as above, one 
associates an operation from the tensor product of 
these vector spaces corresponding to the incoming 
boundaries to the tensor product of the vector 
spaces corresponding to the outgoing boundaries. 
Of course, these operations have to respect the 
relevant gluing of open-closed cobordisms. 

By developing a theory of fat graphs that encode 
the open-closed boundary data, Ramirez was able 
to prove that there are string topology operations 
that form a positive boundary, topological quantum 
field theory with D-branes (Ramirez 2005). 

We end these notes by a discussion of three 
applications of string topology to classifying spaces 
of groups. 


Figure 4 Open-closed cobordism.


Example 1 Application to Poincaré duality groups
(Abbaspour et al. to appear). For G any discrete
group, one has that the loop space of the classifying
space satisfies


\[ LBG \simeq \coprod_{[g]} BC_g \]


where [g] is the conjugacy class determined by
$g \in G$, and $C_g < G$ is the centralizer of g.
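
The indexing set of this decomposition is easy to compute for a small finite group. The following sketch is only an illustration and is not taken from the cited work: it uses the symmetric group on three letters as an assumed example, and the helper names `compose`, `inverse`, and `conjugacy_data` are hypothetical. It lists one representative g per conjugacy class together with its centralizer $C_g$, which are the groups whose classifying spaces appear as the components of LBG.

```python
from itertools import permutations


def compose(a, b):
    """Composition of permutations given as tuples: (a o b)(i) = a[b[i]]."""
    return tuple(a[i] for i in b)


def inverse(a):
    """Inverse permutation."""
    inv = [0] * len(a)
    for i, ai in enumerate(a):
        inv[ai] = i
    return tuple(inv)


def conjugacy_data(group):
    """Return a list of (representative g, class [g], centralizer C_g)."""
    seen, data = set(), []
    for g in group:
        if g in seen:
            continue
        # Conjugacy class of g: all elements h g h^{-1}.
        cls = {compose(compose(h, g), inverse(h)) for h in group}
        # Centralizer of g: all h that commute with g.
        centralizer = [h for h in group if compose(h, g) == compose(g, h)]
        seen |= cls
        data.append((g, cls, centralizer))
    return data


S3 = list(permutations(range(3)))
for g, cls, cent in conjugacy_data(S3):
    print(g, "class size", len(cls), "centralizer order", len(cent))
```

By the orbit-stabilizer theorem, the class size times the centralizer order equals the order of the group for every class, which gives a quick consistency check on the output.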

When BG is represented by a closed manifold, or 
more generally, when G is a Poincaré duality group, 
the Chas-Sullivan loop product then defines pairings 
among the homologies of the centralizer subgroups. 
Abbaspour et al. describe this loop product entirely 
in terms of group homology, thus giving structure 
to the homology of Poincaré-duality groups that 
previously had not been known. 


Example 2 Applications to 3-manifolds
(Abbaspour 2005). Let $\iota : H_*(M) \to H_*(LM)$ be
induced by inclusion of constant loops. This is a
split injection of rings. Write $H_*(LM) = H_*(M) \oplus A_M$.
We say $H_*(LM)$ has nontrivial extended loop
products if the composition


\[ A_M \otimes A_M \hookrightarrow H_*(LM) \otimes H_*(LM) \xrightarrow{\ \mu\ } H_*(LM) \]


is nontrivial.

Let M be a closed, irreducible 3-manifold. In a
remarkable piece of work, Abbaspour showed the
relationship between having a trivial extended loop
product and M being “algebraically hyperbolic.”
This means that M is a $K(\pi, 1)$ and its fundamental
group has no rank-2 abelian subgroup. (If the
geometrization conjecture is true, this is equivalent to M
admitting a complete hyperbolic metric.)


Example 3 The string topology of classifying
spaces of compact Lie groups (Gruher (to appear)
and Gruher and Salvatore (to appear)). The goal
of Gruher's work is to construct string topological
invariants of $LBG \simeq EG \times_G G$, where G acts on
itself via conjugation. Ultimately, one would like to
understand the relationship between this structure
and the work of Freed (2003) on twisted equivariant
K-theory, $K_G(G)$ and the Verlinde algebra.


The first observation in this program was to
notice that the key ingredient in the forming of the
Chas-Sullivan loop product is that the fibration
$ev : LM \to M$ is a fiberwise monoid over a closed
oriented manifold. The fiber is $\Omega M$, which has the
usual Pontrjagin product.

The following was proved by Gruher and
Salvatore:


Lemma 3 Let $G \to E \to M$ be a fiberwise monoid
over a closed manifold M. Then $E^{-TM}$ is a ring
spectrum.
The following construction gives a large supply of
examples of such fiberwise monoids over manifolds.

Let $G \to P \to M$ be a principal G bundle over a
closed manifold M. We can construct the
corresponding adjoint bundle,


\[ \mathrm{Ad}(P) = P \times_G G \to M \]


It is an easy observation that $G \to \mathrm{Ad}(P) \to M$ is a
fiberwise monoid.


Theorem 4 $\mathrm{Ad}(P)^{-TM}$ is a ring spectrum. This ring
structure is natural with respect to maps of principal
G-bundles.


Let BG be the classifying space of a compact Lie
group. It is possible to construct a filtration of BG,


\[ M_1 \hookrightarrow M_2 \hookrightarrow \cdots \hookrightarrow M_i \hookrightarrow M_{i+1} \hookrightarrow \cdots \hookrightarrow BG \]


where the $M_i$'s are compact, closed manifolds. An
example of this is filtering BU(n) by Grassmannians.

Let $G \to P_i \to M_i$ be the restriction of $EG \to BG$.
By the above theorem one obtains an inverse system
of ring spectra


\[ \mathrm{Ad}(P_1)^{-TM_1} \leftarrow \mathrm{Ad}(P_2)^{-TM_2} \leftarrow \cdots \leftarrow \mathrm{Ad}(P_i)^{-TM_i} \leftarrow \mathrm{Ad}(P_{i+1})^{-TM_{i+1}} \leftarrow \cdots \]


Theorem 5 The homotopy type of this pro-ring- 
spectrum is a well-defined invariant of BG. It is 
referred to as the “string topology of BG.” 


Potential Application: Twisted K-theory 
and the Verlinde Algebra 


Let G be a connected, compact Lie group. Using the 
observation that the loop space of a classifying space 
is the classifying space of the loop group, 
$L(BG) \simeq B(LG)$, the string topology gives new
structure on the classifying space of these loop 
groups. In particular, one has new structure on the 
K-theory of these classifying spaces. Now classical 
results of Atiyah and Segal suggest that K-theory of 
classifying spaces should be related to the representa- 
tion theory of the group. In this case, the representa- 
tion theory of loop groups has been widely studied 
and is very important in conformal field theory. 
Understanding the precise relationship between the 
string topology of the classifying space and 
this representation theory is an interesting area of 
current research. To motivate this, first recall that the 
loop space, LBG, has a well-known description as 


\[ LBG \simeq EG \times_G G \]


where the right-hand side refers to the homotopy
orbit space of the conjugation (or adjoint) action of
G on itself. Thus, the homology $H_*(LBG)$ is the
equivariant homology $H_*^G(G)$. Similarly, the
K-theory $K^*(LBG)$ maps to the equivariant K-theory,
$K^*_G(G)$. Now in recent work of Freed (2003) twisted
equivariant K-homology, $K_G(G)$ was shown to be
isomorphic to the Verlinde algebra. This algebra is a 
space of representations of the loop group, LG. The 
multiplication in this algebra is the “fusion product,” 
coming from conformal field theory. One topic of 
current research is to understand the relationship 
between multiplicative structure coming from the 
string topology of BG, and this fusion product in the 
Verlinde algebra. More generally, the goal is to bring 
to bear the considerable calculational techniques of 
algebraic topology that are available in string 
topology, to understand the recently uncovered field 
theoretic structure of twisted K-theory (Freed 2003), 
and its applications to string theory. 
Introduction


Superfluidity has been known to exist since the 
1930s. This widespread phenomenon occurs in 
many-particle Bose and Fermi systems as different 
as liquid ⁴He, liquid ³He, atomic gases like Rb and
Li, atomic nuclei, pulsars and last, but not least, in 
metals, where the itinerant electrons may become 
superfluid. This article is devoted to a unifying 
theoretical description of Bose and Fermi super- 
fluidity. The mechanisms leading to superfluidity 
include Bose-Einstein condensation (BEC) and 
Bardeen, Cooper, and Schrieffer (BCS)-Leggett 
pairing correlations. We hope to be able to 
demonstrate why this fascinating phenomenon is — 
even roughly 80 years after its experimental discov- 
ery and its first theoretical explanation - still a 
subject of intensive research. 

The phenomenon of superfluidity is closely
connected with the apparent lack of any measurable
flow resistance, which scales with the shear viscosity
of the fluid. Its complete absence implies that
the system moves without friction, that is, with zero viscosity.
The observation of superfluidity is usually precluded
by the solidification of most liquids as the
temperature is lowered. Only systems with particularly
light atoms (like the helium isotopes ³He and ⁴He)
stay liquid down to the lowest temperatures.
These systems are referred to as “quantum liquids,”
since their liquid state is caused by the
quantum-mechanical zero-point motion of the atoms. It
should be noted that the helium isotopes
belong to two different kinds of elementary
particles which can be distinguished by their
statistics: ⁴He is a spin-0 boson and ³He a
spin-1/2 fermion.

In 1924, Satyendra Nath Bose and Albert Einstein
proposed that below a characteristic degeneracy
temperature $T_B$, a macroscopic number of bosons
can condense into the state of lowest energy $\epsilon_0 = 0$.
In the 1930s, Fritz London and Heinz London
showed that this so-called Bose-Einstein condensate
can be described by a macroscopic
quantum-mechanical wave function like the one for a single
elementary particle, but with the probability density
replaced by the density of the condensed particles.
By the end of the 1930s, the experimental results of
Allen, Kamerlingh-Onnes, Keesom, Kapitza,
Misener, Wolfke, and others had accumulated
evidence that liquid ⁴He undergoes a second-order
phase transition at $T_\lambda = 2.17$ K to a state referred to
as a superfluid, since the liquid could flow without
any sign of a flow resistance. This superfluid state
was interpreted in terms of Bose condensation of the
⁴He atoms in the liquid (London 1938).

In Figure 1 the P-T phase diagram of liquid ⁴He is
shown with a normal liquid phase, a solid phase and
the superfluid phase below the λ-line at about 2 K.

Fermions cannot condense in a way similar to the
BEC, due to the Pauli exclusion principle. In 1957
Bardeen, Cooper, and Schrieffer came up with their
ingenious proposal that the superfluidity of the
electron system (usually referred to as
superconductivity) comes about through the formation of
fermion pairs (quasibosons) in k-space in a
spin-singlet state. In 1971, several superfluid phases of
liquid ³He at a few mK were discovered by Lee,
Osheroff, and Richardson at Cornell University.
Experimental aspects connected with the spin
degrees of freedom of the quantum liquid gave
strong evidence for Cooper pairing of the ³He atoms
in a spin-triplet state. In Figure 2 the zero-field P-T
phase diagram of liquid ³He is shown with a normal
(Fermi) liquid phase, a solid phase and the
superfluid A and B phases.

Immediately after this discovery, Anthony
J Leggett applied the BCS ideas to liquid ³He and
introduced a generalized scheme that allowed for
triplet-pairing correlations. His theory turned out to
describe a large variety of experimental results
accurately. A new and exciting development set in
when Bose-Einstein condensates were discovered for
the first time in dilute gases of alkali atoms in 1995
by Cornell and Wieman et al. (Rb), Ketterle et al.
(Na), and Hulet et al. (Li).

Figure 1 The phase diagram of liquid ⁴He. Courtesy of Erkki
Thuneberg.

Figure 2 The phase diagram of liquid ³He. Courtesy of Erkki
Thuneberg.


Boson and Fermion Degeneracy 


In what follows, the energy dispersion of Bose and
Fermi systems is denoted as $\epsilon_k$ (free bosons/fermions
would be represented by $\epsilon_k = \hbar^2 k^2/2m$). A large
number of bosons can occupy Bose quantum states
$|k\rangle$; the average occupation is dictated by the
Bose-Einstein distribution


\[ n_k = \frac{1}{e^{\epsilon_k/k_B T + \alpha} - 1} \qquad [1] \]


For Bose systems, the chemical potential is negative,
$\mu = -k_B T \alpha$, and $\alpha$ is fixed by the condition


\[ n = \frac{1}{V} {\sum_k}' n_k = \frac{1}{\lambda_T^3} B_{3/2}(\alpha) \qquad [2] \]


where the prime indicates the summation over
excited states $|k| > 0$. In [2], $\lambda_T = \hbar \sqrt{2\pi/(m k_B T)}$
denotes the thermal de Broglie wavelength which
provides a criterion for the importance of quantum
effects or degeneracy through $n\lambda_T^3 > O(1)$. The Bose
integrals $B_\sigma(\alpha)$ originate from the conversion of the
momentum sum into an energy integral and read for
parabolic dispersion:


\[ B_\sigma(\alpha) = \frac{1}{\Gamma(\sigma)} \int_0^\infty \frac{dy\, y^{\sigma-1}}{e^{y+\alpha} - 1} = \sum_{\nu=1}^{\infty} \frac{e^{-\nu\alpha}}{\nu^\sigma} \qquad [3] \]


with $B_\sigma(0) = \zeta(\sigma)$, $\Gamma$ the Euler $\Gamma$-function and $\zeta$
denoting the Riemann $\zeta$-function. It is important to
understand that in order to have a constant total density
n, $B_{3/2}(\alpha)$ has to increase $\propto T^{-3/2}$ in the same way as
$\lambda_T^3$. This is, however, impossible at all temperatures
since the chemical potential of the Bose gas vanishes
($\alpha \to 0$) at a finite temperature $T_B$ given by


\[ T_B = \frac{2\pi\hbar^2}{m k_B} \left( \frac{n}{\zeta(3/2)} \right)^{2/3} \qquad [4] \]


for which $n\lambda_{T_B}^3 = B_{3/2}(0) = \zeta(3/2) = 2.612\ldots$.
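
To get a feeling for the size of the condensation temperature, the condition $n\lambda_{T_B}^3 = \zeta(3/2)$ can be evaluated numerically. The following is a rough sketch under stated assumptions: the helper names `zeta` and `bose_temperature` are hypothetical, the gas is treated as ideal, and the input density (corresponding to a liquid ⁴He mass density of about 145 kg/m³) is an illustrative value that does not come from the text.

```python
import math

HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J / K


def zeta(sigma, terms=10_000):
    """Riemann zeta(sigma) for sigma > 1: partial sum of the series plus an
    Euler-Maclaurin estimate of the remaining tail."""
    partial = sum(nu ** -sigma for nu in range(1, terms + 1))
    tail = terms ** (1 - sigma) / (sigma - 1) - 0.5 * terms ** -sigma
    return partial + tail


def bose_temperature(n, m):
    """Ideal-gas condensation temperature from n * lambda_T^3 = zeta(3/2)."""
    return 2 * math.pi * HBAR**2 / (m * K_B) * (n / zeta(1.5)) ** (2 / 3)


# Illustrative inputs (assumed): mass of a 4He atom and the number density
# corresponding to a liquid mass density of about 145 kg/m^3.
M_HE4 = 6.6464731e-27            # kg
n_liquid = 145.0 / M_HE4         # m^-3
print(zeta(1.5))                              # close to 2.612
print(bose_temperature(n_liquid, M_HE4))      # of order 3 K
```

With these inputs the ideal-gas estimate comes out at a few kelvin, the same order of magnitude as the observed transition in liquid ⁴He, even though the real liquid is strongly interacting.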


In sharp contrast, fermions obey the Pauli
exclusion principle, which states that only one fermion
can occupy a quantum state $|k,\sigma\rangle$ specified in
addition by the spin projection $\sigma$. The average
statistical occupation is given by the Fermi-Dirac
distribution


\[ f_k = \frac{1}{e^{(\epsilon_k - \mu)/k_B T} + 1} \qquad [5] \]


Figure 3 shows a comparison of Bose-Einstein
and Fermi-Dirac momentum distributions $n_k$ plotted
vs. $\epsilon_k$. The chemical potential is shown for fermions
only, $\mu_F = -k_B T \alpha$ is always positive and the total
density can be expressed as


\[ n = \frac{1}{V} \sum_{k\sigma} f_k = \frac{2}{\lambda_T^3} F_{3/2}(\alpha) \qquad [6] \]


where the factor of 2 originates from the spin
degeneracy. For parabolic dispersion, the Fermi
integral reads:


\[ F_\sigma(\alpha) = \frac{1}{\Gamma(\sigma)} \int_0^\infty \frac{dy\, y^{\sigma-1}}{e^{y+\alpha} + 1} \ \xrightarrow{\ T \to 0\ }\ \frac{(\mu/k_B T)^\sigma}{\Gamma(\sigma+1)} \qquad [7] \]


One recognizes that the degeneracy condition
$n\lambda_T^3 \gg 1$ corresponds to the limit $T \ll T_F = \mu(0)/k_B$,
which is connected with the formation of
a “Fermi sea,” with $\mu(0) = E_F$ the Fermi energy:


\[ \mu(0) = \frac{\hbar^2}{2m} (3\pi^2 n)^{2/3} = E_F \qquad [8] \]
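
The free-fermion expression $E_F = (\hbar^2/2m)(3\pi^2 n)^{2/3}$ is simple to evaluate. The sketch below is an illustration only: the function names `fermi_energy` and `fermi_temperature` are hypothetical, and the conduction-electron density of copper used as input is an assumed textbook value, not a number from this article.

```python
import math

HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J / K
EV = 1.602176634e-19     # J per electron volt


def fermi_energy(n, m):
    """Fermi energy of free spin-1/2 fermions of mass m at number density n."""
    return HBAR**2 / (2 * m) * (3 * math.pi**2 * n) ** (2 / 3)


def fermi_temperature(n, m):
    """Degeneracy temperature T_F = E_F / k_B."""
    return fermi_energy(n, m) / K_B


# Illustrative input (assumed): conduction electrons in copper.
M_E = 9.1093837e-31      # kg
n_cu = 8.47e28           # m^-3
print(fermi_energy(n_cu, M_E) / EV)    # a few eV
print(fermi_temperature(n_cu, M_E))    # several 10^4 K
```

Because the resulting $T_F$ is far above room temperature, the electron gas in an ordinary metal is deep in the degenerate regime $T \ll T_F$.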


To summarize, quantum behavior in Bose and
Fermi systems sets in below the degeneracy
temperature $T^*$, defined through $n\lambda_{T^*}^3 = O(1)$. For bosons,
$T^* = T_B$ is the temperature at which the chemical
potential vanishes, whereas for fermions $T^* = T_F$ is
the Fermi temperature.

Figure 3 The Fermi and Bose momentum distribution.


London Quantum Hydrodynamics 


For a general treatment of the quantum-mechanical
origin of the equations describing Bose and Fermi
superfluidity, it is convenient to introduce a
parameter $\nu$ which describes single bosons ($\nu = 1$) or
fermion pairs ($\nu = 2$) of mass $M = \nu m$. The basic
assumption (London 1938) is that the laws of
quantum mechanics are applicable also to a
macroscopic number of single ($\nu = 1$) or composite ($\nu = 2$)
particles of density $\rho^s/\nu m$, the so-called condensate,
which is represented by a macroscopic wave
function $\psi(r, t)$. $\psi$ has the property


\[ \psi(r,t)\, \psi^*(r,t) = \frac{\rho^s(r,t)}{\nu m} \]


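The dynamics of the condensate is governed by
the Schrödinger equation


\[ i\hbar \frac{\partial \psi}{\partial t} = \left( -\frac{\hbar^2 \nabla^2}{2\nu m} + \nu\mu \right) \psi \qquad [9] \]


in which $\mu$ represents the condensate’s chemical
potential. After performing a Madelung
transformation (Madelung 1926):


\[ \psi = a\, e^{i\varphi}, \qquad a^2 = \frac{\rho^s}{\nu m} \]


one arrives at two coupled hydrodynamic equations,
the first of which reads


\[ \frac{\partial \rho^s}{\partial t} + \nabla \cdot j^s = 0, \qquad j^s = \rho^s v^s, \qquad v^s = \frac{\hbar}{\nu m} \nabla\varphi \qquad [10] \]


Equation [10] can be interpreted as a continuity
equation, which represents the conservation law for
the condensate mass density $\rho^s$. The second equation


\[ -\hbar \frac{\partial \varphi}{\partial t} = \frac{1}{2} \nu m\, (v^s)^2 + \nu\mu + O(\hbar^2 \nabla^2) \qquad [11] \]


assumes the form of the Hamilton-Jacobi equation
for the action field of classical mechanics $\hbar\varphi$, if the
quasiclassical limit (terms $\propto O(\hbar^2 \nabla^2) \to 0$) is taken.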
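
The step from the Schrödinger equation to the two hydrodynamic equations can be made explicit. The following is a sketch under stated assumptions: the condensate is taken to obey a Schrödinger equation with mass $\nu m$ and potential term $\nu\mu$, and the Madelung ansatz is written as $\psi = a\, e^{i\varphi}$ with real amplitude $a$ and phase $\varphi$ (these symbol names are chosen here for illustration).

```latex
% Insert \psi = a e^{i\varphi} into
%   i\hbar \partial_t \psi = ( -\hbar^2 \nabla^2 / (2\nu m) + \nu\mu ) \psi ,
% divide by e^{i\varphi}, and separate imaginary and real parts.
\begin{align*}
\text{Imaginary part:}\quad
  & \hbar\, \partial_t a = -\frac{\hbar^2}{2\nu m}
    \left( 2\, \nabla a \cdot \nabla\varphi + a\, \nabla^2 \varphi \right)
  \;\Longrightarrow\;
  \partial_t a^2 + \nabla \cdot \left( a^2\, \frac{\hbar}{\nu m} \nabla\varphi \right) = 0, \\
\text{Real part:}\quad
  & -\hbar\, \partial_t \varphi = \frac{\hbar^2}{2\nu m} (\nabla\varphi)^2 + \nu\mu
    - \frac{\hbar^2}{2\nu m} \frac{\nabla^2 a}{a}.
\end{align*}
```

Multiplying the first line by $\nu m$ and identifying $\rho^s = \nu m\, a^2$ and $v^s = (\hbar/\nu m)\nabla\varphi$ gives a continuity equation for the condensate mass density. In the second line, $(\hbar^2/2\nu m)(\nabla\varphi)^2 = \tfrac{1}{2}\nu m (v^s)^2$, and the last term is the quantum correction of order $\hbar^2\nabla^2$ that is dropped in the quasiclassical limit.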
From [10] and [11] a condensate acceleration
equation can be derived, which resembles the Euler
equation of classical hydrodynamics ($\mu = \mu_0 + \delta\mu$):


\[ \frac{\partial v^s}{\partial t} + \nabla \left( \frac{(v^s)^2}{2} \right) = -\frac{1}{m} \nabla \delta\mu \qquad [12] \]


The physical nature of the driving force becomes
evident after applying the Gibbs-Duhem relation
$n\, \delta\mu = \delta P - \sigma\, \delta T$. Finally, the acceleration of the
mass supercurrent $j^s$ is of the form


\[ \frac{\partial j^s}{\partial t} = -\frac{\rho^s}{n m} \nabla (\delta P - \sigma\, \delta T) \qquad [13] \]


It turns out that the London equations [10] and
[13], in which $\rho^s$ is an unknown phenomenological
parameter, explain many experimental observations
such as persistent currents, U-tube oscillations,
thermomechanical (e.g., fountain-) effects, beaker
flow phenomena, and many others.


Bose-Einstein Condensation (BEC) 


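In order to understand the macroscopic quantum
state in case of Bose systems, we consider first the
simple case of a Bose gas. Let us decompose the
energy eigenstates $\epsilon_k$ into those with $\epsilon_k = \epsilon_0 = 0$
(condensate) and average occupation number


\[ n_0 = \frac{N_0}{V} = \frac{1}{V} \frac{1}{e^{\alpha} - 1} \qquad [14] \]


and those with $\epsilon_k > 0$ (excited states) and average
occupation number


\[ n_{ex} = \frac{N_{ex}}{V} = \frac{B_{3/2}(\alpha)}{\lambda_T^3} = n\, \frac{B_{3/2}(\alpha)}{B_{3/2}(0)} \left( \frac{T}{T_B} \right)^{3/2} \qquad [15] \]


with the total density $n = n_{ex} + n_0$. The consequence of
the chemical potential vanishing at $T_B$ clearly is a
macroscopic occupation of the ground state of the Bose gas:


\[ N_0 \underset{\alpha \to 0}{=} \frac{1}{1 + \alpha + \cdots - 1} = \frac{1}{\alpha} \qquad [16] \]


This phenomenon is referred to as BEC. Below
$T_B$, $\alpha = 0$ and from [15] we see that


\[ n_{ex}^{\text{free bosons}} = n \left( \frac{T}{T_B} \right)^{3/2} \qquad [17] \]


The average occupation of the ground state is given by


\[ n_0(T) = n - n_{ex}(T) = n \left[ 1 - \left( \frac{T}{T_B} \right)^{3/2} \right] \qquad [18] \]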
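
The temperature dependence of the condensate occupation of the free Bose gas, $n_0/n = 1 - (T/T_B)^{3/2}$ below $T_B$ and zero above, can be tabulated in a few lines. This is a minimal sketch; the function name `condensate_fraction` is hypothetical and the formula applies to the ideal gas only.

```python
def condensate_fraction(T, T_B):
    """Ideal Bose gas condensate fraction n_0 / n at temperature T.

    Below T_B the excited states hold a fraction (T / T_B)**1.5 of the
    particles; above T_B there is no macroscopic ground-state occupation.
    """
    if T < 0 or T_B <= 0:
        raise ValueError("temperatures must satisfy T >= 0 and T_B > 0")
    if T >= T_B:
        return 0.0
    return 1.0 - (T / T_B) ** 1.5


# Tabulate the fraction at a few reduced temperatures T / T_B.
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, condensate_fraction(t, 1.0))
```

At half the condensation temperature roughly two thirds of the particles already occupy the ground state.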


It is important to understand that the number
density of condensed particles $n_0$ has nothing to
do with the current response function $\rho^s$ (eqn [10]).
A derivation of $\rho^s$ will be given in the section “Local
response of condensates and excitation gases.”

Let us now discuss the structure of the excitation
spectrum, which will turn out to be crucial for the
observability of superfluidity, in some more detail.
Suppose that a macroscopic object of mass M moves
through the superfluid. Then one may ask the question,
at what velocity does this motion cause the creation of
an excitation of energy $E_p$ and momentum p. The
condition can be formulated in terms of the velocity
difference $v_i - v_f$ as $E_p = M(v_i^2 - v_f^2)/2$ and
$p = M(v_i - v_f)$. Eliminating $v_f$ yields $E_p = p \cdot v_i + O(M^{-1})$
so that the condition for the creation of an
excitation leads to the so-called Landau critical velocity


\[ v_L = \min \left\{ \frac{E_p}{|p|} \right\} > 0 \qquad [19] \]


It is immediately clear that for free bosons $v_L = 0$.
This means that a free Bose gas can never be a
superfluid, since drag forces on moving objects will
start to act even at smallest velocities.

It turns out that interaction effects can drastically 
modify the nature of the elementary excitations. In 
1947, Nikolai Bogoliubov showed (for the first time 
using the method of second quantization) that even in 
the limit of weak repulsive interactions the excitation 
spectrum is phonon-like $E_p = c|p|$, with c the sound
velocity. Lev Landau and Richard Feynman
investigated the situation for superfluid ⁴He, where the
interactions between the atoms are far from weak.
Landau (1947) postulated the following form for the
excitation spectrum, for which Feynman (1953) gave
the microscopic justification. At low momenta, the
spectrum is phonon-like and linear in p:


\[ \lim_{p \to 0} E_p = E_p^{phon} = c|p| \qquad [20] \]


At higher momenta, the spectrum is reminiscent
of that of crystal phonons in that $E_p$ passes through a
maximum, and then, at a characteristic momentum
$p_0$ approaches the next minimum, which, however,
is located at a finite energy $\Delta$. Feynman called this
part of the spectrum the “roton” (mass $m_r$) in an
analogy with a “smoke ring,” since it is connected
with the forward motion of a particle accompanied
by a ring of back-flowing other particles:


\[ \lim_{|p| \to p_0} E_p = E_p^{rot} = \Delta + \frac{(|p| - p_0)^2}{2 m_r} \qquad [21] \]


Figure 4 The phonon-roton spectrum (phonon branch at small p, roton minimum at $p_0$).


Figure 4 shows a sketch of the phonon-roton
spectrum of superfluid ⁴He. Clearly, the Landau
critical velocity for the phonon-roton spectrum is
characterized by the roton minimum and is given by
$v_L = \Delta/p_0$.
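
The Landau criterion, the minimum of $E_p/|p|$, can be evaluated numerically for a parabolic roton branch $E_p = \Delta + (|p| - p_0)^2/2m_r$. The sketch below is illustrative only: the roton parameters are assumed values of a commonly quoted order of magnitude for superfluid ⁴He and are not taken from this article, and the helper names `roton_energy` and `landau_velocity` are hypothetical. For this parabolic form the minimum lies at $p^* = \sqrt{p_0^2 + 2 m_r \Delta}$, slightly above $p_0$, so the exact value is a little smaller than the simple estimate $\Delta/p_0$.

```python
import math

HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J / K
M_HE4 = 6.6464731e-27    # kg

# Assumed, illustrative roton parameters (not from the text).
DELTA = 8.65 * K_B       # roton gap
P0 = 1.92e10 * HBAR      # roton momentum (wave number 1.92 per angstrom)
M_R = 0.16 * M_HE4       # roton effective mass


def roton_energy(p):
    """Parabolic roton branch near the minimum at P0."""
    return DELTA + (p - P0) ** 2 / (2 * M_R)


def landau_velocity(energy, p_min, p_max, steps=100_000):
    """Minimum of energy(p) / p on a uniform grid between p_min and p_max."""
    best = math.inf
    for i in range(steps + 1):
        p = p_min + (p_max - p_min) * i / steps
        best = min(best, energy(p) / p)
    return best


v_grid = landau_velocity(roton_energy, 0.5 * P0, 1.5 * P0)

# Closed form for the parabolic branch: the tangent from the origin touches
# the curve at p_star, where E / p equals the slope dE/dp.
p_star = math.sqrt(P0**2 + 2 * M_R * DELTA)
v_exact = (p_star - P0) / M_R

print(v_grid, v_exact, DELTA / P0)  # all in m/s
```

With these assumed parameters the grid result and the closed form agree, and both lie within a few percent of $\Delta/p_0$.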


BCS-Leggett Pair Condensation 


The key assumptions of the weak-coupling
mean-field BCS-Leggett pairing model can be summarized
as follows: one first assumes that at sufficiently low
temperatures it is energetically favorable that a
temperature-dependent part of the fermions forms
so-called Cooper pairs. This pair formation is caused
by an attractive interaction in k-space near the
Fermi surface:


\[ \Gamma^{(s)}_{kp} < 0, \qquad |\xi_k|, |\xi_p| < \epsilon_c \]


Here $\xi_k = \epsilon_k - \mu$ measures the energy from the
chemical potential. The index s denotes the total
spin of the pair. Classical superconductors have
pairs in a relative singlet state $s = 0$, $m_s = 0$ whereas
the superfluid phases of liquid ³He have pairs in a
relative spin-triplet state $s = 1$, $m_s = 0, \pm 1$, with $m_s$
the magnetic quantum number. The amplitude of
spontaneous pair formation is


\[ g_{k\sigma_1\sigma_2} = \langle c_{-k\sigma_1} c_{k\sigma_2} \rangle \neq 0, \qquad T < T_c \qquad [22] \]


with $k = k_1 - k_2$ the relative momentum of the
pair. The attractive interaction that drives the
Cooper-pair formation connects the pairing
amplitude $g_{k\sigma_1\sigma_2}$ with a new energy scale, the so-called
pair potential


\[ \Delta_{k\sigma_1\sigma_2} = \sum_p \Gamma^{(s)}_{kp}\, g_{p\sigma_1\sigma_2} \qquad [23] \]


As a consequence of triplet pairing the spin part of
the pair potential is “even” upon interchange of $\sigma_1$
and $\sigma_2$: $\Delta_{k\sigma_1\sigma_2} = \Delta_{k\sigma_2\sigma_1}$. Then the Pauli principle
requires that $\Delta_{k\sigma_1\sigma_2}$ must be “odd” with respect
to the interchange of $k_1$ and $k_2$ or, equivalently,
$k \to -k$. The k-dependence can now be classified by
an orbital quantum number $\ell$ with the special cases
of $\ell = 1$ (p-wave) pairing, $\ell = 3$ (f-wave) pairing, etc.
All superfluid phases of ³He are characterized by
p-wave orbital symmetry.
The transition temperature T. from [23] reads 


2e? qi) 
ka Tt. = — ege MITT) 
TU 


with Ny = 3n/2Ey the density of states at the Fermi 
level and y=0.577... the Euler constant. The 
energies £, can trivially be divided into particle-like 
(€ > 0) and hole-like (£, < 0) terms. The presence 
of the pair potential Az leads to a mixing of particle- 
and hole-like contributions to the energy, which 


becomes a matrix in particle-hole, or Nambu space 
(Nambu 1960), and generates what is referred to as 
off-diagonal long-range order (ODLRO): 


fel Ak 
Ol MEC [24] 
A, —&l 
As usual, the diagonalization of £, (Bogoliubov 
1958) leads to the energy dispersion of the relevant 


thermal excitations of the superfluid state, the so- 
called Bogoliubov quasiparticles or “bogolons”: 


E,Q— 4/& Aj Ag = Ap- A [2.5] 


In Figure 5, the dispersion $E_k$ of Bogoliubov
quasiparticles vs. $|\mathbf{p}|$ is shown. It turns out that the
superfluid phases (A and B) of liquid $^3$He in zero
magnetic field are characterized by unitary matrices
$\boldsymbol{\Delta}_k$, so that the scalar quantity $\Delta_k$ can be interpreted
as the energy gap in the bogolon spectrum, which, in
general, may be anisotropic in k-space.

The energy gap $\Delta_k$ of the superfluid B-phase can
be represented in the simple nodeless (pseudoisotropic)
and BCS-like form (Balian and Werthamer
(BW), 1963):

$$\Delta_k = \Delta(T), \qquad \frac{\Delta(0)}{k_B T_c} = \frac{\pi}{e^{\gamma}}$$ [26]
Its spin structure is characterized by the presence of all
three triplet components $m_s = 0, \pm 1$ and will be
discussed further with respect to the magnetization
response (see next section). The gap symmetry of $^3$He-A
is uniaxial with respect to an axis $\hat{\boldsymbol{\ell}}$ (Anderson and
Morel 1960; Anderson and Brinkman 1973)

$$\Delta_k = \Delta_0(T)\sin\theta_k, \qquad \frac{\Delta_0(0)}{k_B T_c} = \frac{\pi e^{5/6}}{2e^{\gamma}}$$ [27]

where $\cos\theta_k = \hat{\mathbf{k}}\cdot\hat{\boldsymbol{\ell}}$, and characterized by two point
nodes of $\Delta_k$ at the zeros ($\theta_k = 0, \pi$) on the Fermi
surface. It has furthermore turned out that only the
$m_s = \pm 1$ components of the spin triplet contribute
to its spin dependence (equal spin pairing (ESP)).

Figure 5 The bogolon energy dispersion.
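
As a quick numerical cross-check of the two zero-temperature gap ratios, the following minimal sketch evaluates them, assuming the standard weak-coupling expressions $\pi/e^{\gamma}$ for the isotropic BW gap and $\pi e^{5/6}/(2e^{\gamma})$ for the maximum of the axial gap. The function names are illustrative and do not come from the text.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler constant, gamma = 0.577...


def bw_gap_ratio():
    """Delta(0) / (k_B T_c) for the isotropic (BW) B-phase gap, assumed pi / e^gamma."""
    return math.pi / math.exp(EULER_GAMMA)


def abm_gap_ratio():
    """Delta_0(0) / (k_B T_c) for the axial A-phase gap maximum,
    assumed pi * e^(5/6) / (2 e^gamma)."""
    return math.pi * math.exp(5.0 / 6.0) / (2.0 * math.exp(EULER_GAMMA))


def axial_gap(theta_k, delta0=1.0):
    """Axial gap Delta_0 * sin(theta_k); it vanishes at the two poles theta_k = 0, pi."""
    return delta0 * math.sin(theta_k)


if __name__ == "__main__":
    print(round(bw_gap_ratio(), 3), round(abm_gap_ratio(), 3))
```

Under these assumptions the axial gap maximum is larger than the isotropic gap, while the point nodes at the poles leave gapless directions on the Fermi surface.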


Local Response of Condensates 
and Excitation Gases 


In the previous sections we have seen that the 
structure (energy dispersion, statistics, critical flow 
velocity) of the relevant thermal excitations is of 
crucial importance for the superfluidity. We can 
now aim at a generalized statistical description of 
bosonic (phonons, rotons) and fermionic (bogolons) 
excitation gases, by introducing a generalized 
momentum distribution 


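$$n_\theta\{E_k\} = \frac{1}{e^{E_k/k_B T} - \theta}$$ [28]

and its energy derivative

$$-\frac{\partial n_\theta\{E_k\}}{\partial E_k} = \frac{1}{2k_B T\,[\cosh(E_k/k_B T) - \theta]}$$ [29]

Special cases are

$$\theta = \begin{cases} +1, & \text{Bose (phonons, rotons)} \\ -1, & \text{Fermi (bogolons)} \end{cases}$$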
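
The generalized distribution and the closed form of its energy derivative can be checked against each other numerically. The sketch below assumes the distribution $1/(e^{E/k_BT} - \theta)$ with $\theta = +1$ (Bose) and $\theta = -1$ (Fermi), and the spin assignment $s = (1-\theta)/4$ used in the text; the function names and the sample values are illustrative only.

```python
import math


def n_theta(energy, kT, theta):
    """Generalized distribution 1 / (exp(E/kT) - theta); theta = +1 Bose, -1 Fermi."""
    return 1.0 / (math.exp(energy / kT) - theta)


def minus_dn_dE(energy, kT, theta):
    """Closed form of -dn/dE, valid only for theta = +1 or theta = -1."""
    return 1.0 / (2.0 * kT * (math.cosh(energy / kT) - theta))


def spin_from_theta(theta):
    """Spin s = (1 - theta) / 4: s = 0 for Bose, s = 1/2 for Fermi excitations."""
    return (1 - theta) / 4


def numerical_minus_dn_dE(energy, kT, theta, step=1e-6):
    """Central finite difference for -dn/dE, used only as a cross-check."""
    upper = n_theta(energy + step, kT, theta)
    lower = n_theta(energy - step, kT, theta)
    return -(upper - lower) / (2.0 * step)


if __name__ == "__main__":
    for theta in (1, -1):
        print(theta, minus_dn_dE(0.7, 0.3, theta), numerical_minus_dn_dE(0.7, 0.3, theta))
```

The closed form relies on $\theta^2 = 1$, which is why it holds for both statistics but not for a generic value of $\theta$.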

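Introducing the spin $s = (1 - \theta)/4$, the total
momentum density response to the presence of a
superfluid velocity

$$\mathbf{v}^s = \frac{\hbar\nabla\varphi}{(2s+1)m}, \qquad s = 0, \tfrac{1}{2}$$

and a normal fluid velocity $\mathbf{v}^n$ can be written in the
general form

$$\mathbf{j}_m = \frac{2s+1}{V}\sum_{\mathbf{k}} \mathbf{p}\, n_\theta\{E_k + \delta E_k\} + \rho\,\mathbf{v}^s$$ [30]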


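After Taylor-expanding $n_\theta$ with respect to the
small energy shifts $\delta E_k = \mathbf{p}\cdot(\mathbf{v}^s - \mathbf{v}^n)$, one may
introduce the so-called normal fluid density tensor

$$\rho^n_{ij} = \frac{2s+1}{V}\sum_{\mathbf{k}} p_i\left(-\frac{\partial n_\theta\{E_k\}}{\partial E_k}\right)p_j$$ [31]

and the momentum density assumes the form

$$\mathbf{j}_m = \boldsymbol{\rho}^s\cdot\mathbf{v}^s + \boldsymbol{\rho}^n\cdot\mathbf{v}^n, \qquad \boldsymbol{\rho}^s = \rho\,\mathbf{1} - \boldsymbol{\rho}^n$$ [32]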
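
The step from [30] to [32] can be spelled out as follows. This is a sketch under two assumptions that the text does not state explicitly: the unshifted distribution carries no net momentum, so the zeroth-order term of the expansion vanishes, and terms beyond first order in the velocities are dropped. The energy derivative of the distribution is written out explicitly rather than abbreviated.

```latex
\begin{align}
n_\theta\{E_k + \delta E_k\}
  &\simeq n_\theta\{E_k\}
   + \frac{\partial n_\theta\{E_k\}}{\partial E_k}\,
     \mathbf{p}\cdot(\mathbf{v}^s - \mathbf{v}^n), \\
j_{m,i}
  &\simeq \rho\, v^s_i
   - \underbrace{\frac{2s+1}{V}\sum_{\mathbf{k}}
       p_i\Big(-\frac{\partial n_\theta\{E_k\}}{\partial E_k}\Big)p_j}_{\rho^n_{ij}}
     \,(v^s_j - v^n_j) \\
  &= \big(\rho\,\delta_{ij} - \rho^n_{ij}\big)\, v^s_j + \rho^n_{ij}\, v^n_j .
\end{align}
```

Read this way, the superfluid density is simply what remains of the total density once the thermally excited part has been subtracted.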


Equation [32] forms the central result of this
essay because it represents the microscopic counterpart
of the generalized London equation [10]. It is
clearly seen how the phenomenon of superfluidity
originates from $\rho^s > 0$ due to a qualitative change
in the dispersion of the elementary excitations,
which may in particular be characterized by a gap in
the excitation spectrum. Equation [32] is more general
than [10] in that it introduces a two-fluid picture in
which the mass supercurrent $\mathbf{j}^s_m = \rho^s\mathbf{v}^s$ (eqn [10]) is
complemented by a normal (excitation) mass current
$\mathbf{j}^n_m = \rho^n\mathbf{v}^n$ in the presence of a macroscopic velocity
field $\mathbf{v}^n$ of the excitation gas obeying arbitrary
statistics. The temperature dependence of $\rho^n(T)$ can
now be computed via [31] and the result depends on
the dispersion of the thermal excitation under consideration.
Figure 6 shows the temperature dependence
of the normal and superfluid density of
superfluid $^4$He. The normal fluid density of superfluid
$^3$He is, in general, a tensor quantity


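$$\rho^n_{ij} = \begin{cases} \rho^n\,\delta_{ij}, & {}^3\text{He-B} \\ \rho^n_{\parallel}\,\hat\ell_i\hat\ell_j + \rho^n_{\perp}\,(\delta_{ij} - \hat\ell_i\hat\ell_j), & {}^3\text{He-A} \end{cases}$$ [33]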

The short-range Fermi liquid interaction leads to a
quasiparticle mass enhancement $m^*/m = 1 + F_1^s/3$
characterized by the pressure-dependent dimensionless
Landau parameter $F_1^s$. In Figure 7, the normal fluid
density ($\rho^n_{\parallel,\perp}$ for $^3$He-A, $\rho^n$ for $^3$He-B) is shown as a
function of reduced temperature at a pressure of 27
bar, where $F_1^s = 12.53$. The entropy density of an
excitation system of arbitrary statistics below the
transition can be written as


$$\sigma = k_B\,\frac{2s+1}{V}\sum_{\mathbf{k}} P_\theta, \qquad P_\theta = \theta(1 + \theta n_\theta)\ln(1 + \theta n_\theta) - n_\theta \ln n_\theta$$ [34]


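with $n_\theta = n_\theta\{E_k\}$, from which one may derive the
specific heat capacity

$$c_V(T) = T\,\frac{\partial\sigma}{\partial T} = \frac{2s+1}{V}\sum_{\mathbf{k}} E_k\,\frac{\partial n_\theta\{E_k\}}{\partial T}$$ [35]

Figure 6 The normal and superfluid density for He-II.

Figure 7 The normal fluid density for $^3$He-A, B.

After a Taylor expansion of $n_\theta$ with respect to the small
local temperature change $\delta T$, the result for $c_V(T)$ reads

$$c_V(T) = \frac{2s+1}{V}\sum_{\mathbf{k}} \left(-\frac{\partial n_\theta\{E_k\}}{\partial E_k}\right) E_k\left[\frac{E_k}{T} - \frac{\partial E_k}{\partial T}\right]$$ [36]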
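
The route from the entropy density to the specific heat rests on two elementary identities, sketched here under the assumption that the distribution depends on temperature only through the ratio $E_k(T)/k_BT$. The symbol $P_\theta$ stands for the per-mode entropy function of the generalized distribution and is a label chosen for this sketch.

```latex
\begin{align}
\frac{\partial P_\theta}{\partial n_\theta}
  &= \ln\frac{1 + \theta n_\theta}{n_\theta} = \frac{E_k}{k_B T}
  \quad\Longrightarrow\quad
  T\,d\sigma = \frac{2s+1}{V}\sum_{\mathbf{k}} E_k\, dn_\theta , \\
\frac{\partial n_\theta}{\partial T}
  &= \frac{\partial n_\theta}{\partial E_k}
     \Big(\frac{\partial E_k}{\partial T} - \frac{E_k}{T}\Big)
   = \Big(-\frac{\partial n_\theta}{\partial E_k}\Big)
     \Big(\frac{E_k}{T} - \frac{\partial E_k}{\partial T}\Big).
\end{align}
```

The second bracket shows where the temperature dependence of the gap enters: for a temperature-independent dispersion the term $\partial E_k/\partial T$ drops out.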

In Figures 8 and 9 we show the cusp-like specific heat
of a Bose gas as compared with the specific heat of
$^3$He-A, B, which display discontinuities at $T_c$.

Finally, the superfluid phases of $^3$He are characterized
in addition by the spin degrees of freedom,
reflected by the bogolon spin magnetization
response to an external magnetic field $\mathbf{B}$:


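$$M_0 = \frac{\gamma\hbar}{2}\,\frac{1}{V}\sum_{\mathbf{k}\sigma}\sigma\, n\{E_k - \gamma\hbar\sigma B/2\} = \chi_0 B$$ [37]


where $\gamma$ denotes the gyromagnetic ratio of the fermions.
The bogolon spin susceptibility $\chi_0$ is obtained after a
Taylor expansion of $n$ with respect to $B$ as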


Figure 8 The specific heat capacity of a Bose gas.
Figure 9 The specific heat of $^3$He-A, B.


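Note that eqn [38] accounts only for the $m_s = 0$
(bogolon) contribution to the spin-triplet susceptibility,
the temperature dependence of which is given
by the so-called Yosida function

$$Y(T) = \frac{1}{N_F}\,\frac{1}{V}\sum_{\mathbf{k}\sigma}\left(-\frac{\partial n\{E_k\}}{\partial E_k}\right) \le 1$$

The total susceptibility reads

$$\chi_{\rm tot} = \underbrace{\chi_0}_{\text{bogolons}} + \underbrace{\chi_{+1} + \chi_{-1}}_{\text{condensate}}$$ [39]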


with the condensate contributing through $\chi_{m_s = \pm 1}$ a
fraction of 2/3 of the normal state Pauli susceptibility.
In Figure 10, the reduced spin susceptibility
$\chi/\chi_N$ of $^3$He-A, B is plotted vs. reduced temperature.
While the constant susceptibility is characteristic
of the ESP pairing state, the reduction of the
B-phase susceptibility is due to the lack of the
nonmagnetic $m_s = 0$ contribution to the spin triplet
in the low-temperature limit. Exchange interaction
effects, characterized by the dimensionless Landau
parameter $F_0^a$, lead to a further reduction of the
Balian-Werthamer (BW)-state susceptibility, which
is shown for 27 bar, where $F_0^a = -0.755$. Note that
the theoretical picture reflected in Figure 10, and
also in Figures 6, 7, and 9, is in quantitative
agreement with experimental observations.
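
To illustrate how a Yosida-type function falls below its normal-state value of 1 once a gap opens, here is a minimal numerical sketch. It assumes an isotropic gap, a constant density of states near the Fermi level, and no Fermi-liquid corrections, so it is not the full calculation behind Figure 10; the function name, the integration cutoff and the step count are choices made for this example.

```python
import math


def yosida(delta, kT, n_steps=20000, cutoff=50.0):
    """Integral over xi of -df/dE with E = sqrt(xi^2 + delta^2) for Fermi statistics.

    Uses -df/dE = 1 / (2 kT (cosh(E/kT) + 1)) and a midpoint rule on [0, cutoff * kT].
    """
    xi_max = cutoff * kT
    h = xi_max / n_steps
    total = 0.0
    for i in range(n_steps):
        xi = (i + 0.5) * h                  # midpoint of the i-th interval
        x = math.hypot(xi, delta) / kT      # E / kT
        if x < 700.0:                       # skip terms where cosh would overflow
            total += 1.0 / (2.0 * kT * (math.cosh(x) + 1.0))
    return 2.0 * h * total                  # factor 2: the integrand is even in xi


if __name__ == "__main__":
    for gap in (0.0, 1.0, 2.0, 4.0):
        print(gap, yosida(gap, 1.0))
```

With a vanishing gap the function returns 1 to numerical accuracy, and it decreases monotonically as the ratio of gap to temperature grows, which is the qualitative behaviour behind the reduced bogolon contribution at low temperature.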

Figure 10 The spin susceptibility of $^3$He-A, B.

In summary, superfluidity is a quantum-mechanical
phenomenon seen on a macroscopic scale. It occurs
below the degeneracy temperature $T^* \propto n^{2/3}/m$ of
both Bose and Fermi many-particle systems (like liquid
$^4$He and $^3$He) and is a property of a macroscopic
number of particles, the condensate. The role of (weak
or strong) interactions is manifested in the structure of
the relevant elementary excitations, which always exist
in addition to the condensate at finite temperatures and
above certain critical velocities. These excitations form
a gas, referred to as the normal fluid, since it gives rise
to temperature-dependent thermodynamic and response
functions and contributes to the entropy and the flow
dissipation. Superfluidity is now well understood using
various aspects of the concept of the macroscopic wave
function. On the microscopic level, the mechanisms of
BEC and BCS-Leggett pair formation have been
successfully invoked to understand the fascinating
properties of Bose and Fermi superfluids.


See also: Bose-Einstein Condensates; Bosons and
Fermions in External Fields; High $T_c$ Superconductor
Theory; Topological Knot Theory and Macroscopic
Physics; Variational Techniques for Ginzburg-Landau
Energies; Vortex Dynamics.
Introduction: Minimal D = 4 Supergravity


The essential idea of supersymmetry is an extension of
the relativistic structure group of spacetime, which in
ordinary four-dimensional physics in the absence of
gravity is the Poincaré group ISO(3, 1). In a minimal
supersymmetric theory in flat D = 4 spacetime, the
minimal supersymmetry algebra (the “graded Poincaré
algebra”) adds spinorial generators $Q_\alpha$ to the Lorentz
generators $M_{mn}$ and the translational generators
(momenta) $P_m$, where m = 0, 1, 2, 3. The core relation
is the “anticommutator” of two $Q_\alpha$:


$$\{Q_\alpha, \bar{Q}_\beta\} = -2\gamma^m_{\alpha\beta}P_m$$ [1]


where $\bar{Q} = Q^\dagger\gamma^0$ and the $\gamma^m$ are the Dirac gamma
matrices. In the minimal D = 4 supersymmetry
algebra, the spinor generator $Q_\alpha$ is taken to be
Majorana: $Q = C(\bar{Q})^T$, where $C$ is the charge-conjugation
matrix and $A^T$ denotes the transpose
of the matrix $A$. The full supersymmetry algebra
adjoins to the anticommutation relation [1] the
usual commutation relations among the Lorentz
generators and the commutators of the Lorentz
generators with the momenta and the spinors $Q_\alpha$;
the latter express respectively the vectorial and
spinorial characters of $P_m$ and $Q_\alpha$:


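$$[M_{mn}, M_{pq}] = \eta_{np}M_{mq} - \eta_{mp}M_{nq} - \eta_{nq}M_{mp} + \eta_{mq}M_{np}$$ [2]
$$[M_{mn}, P_p] = \eta_{np}P_m - \eta_{mp}P_n$$ [3]
$$[M_{mn}, Q_\alpha] = -\tfrac{1}{2}(\gamma_{mn}Q)_\alpha$$ [4]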


where $\gamma_{mn} = \frac{1}{2}(\gamma_m\gamma_n - \gamma_n\gamma_m)$ and $\eta_{mn} = \mathrm{diag}(-1, 1, 1, 1)$
is the Minkowski metric. The final relation
in the supersymmetry algebra expresses the flatness
of Minkowski space:


$$[P_m, P_n] = 0$$ [5]


This algebra has been considered as an extension of the 
symmetry algebra of particle physics since the work of 
Gol'fand and Likhtman in 1971, and especially since 
the linearly realized supersymmetric model of Wess 
and Zumino in 1974. That model contains a pair of
D = 4 scalar fields and a D = 4 Majorana spinor, so
the numbers of bosonic and fermionic degrees of
freedom are equal; this is a fundamental characteristic
of supersymmetric theories.

The work of Wess and Zumino led to an
explosion of interest in supersymmetry, especially
once it was realized that renormalizable supersymmetric
models display a cancellation of some of the
divergences that have plagued relativistic quantum
field theory since its inception in the 1930s. In
particular, in renormalizable flat-space field theory
models, divergences quadratic in a high-momentum
cutoff vanish as a result of cancellations between
virtual bosonic and fermionic particles. This is a
very attractive feature for control of the “hierarchy
problem” in particle physics, especially for the
instability inherent in having vastly different scales
within the same theory, for example, the TeV scale
of ordinary electroweak physics and the $10^{16}$ GeV
scale where unification with the strong interactions
might come in.

When one includes gravity, the stability problems 
of particle physics become much more severe. 
Einstein's theory of general relativity is itself non- 
renormalizable, that is, its ultraviolet divergences are 
of different forms from the terms present in the 
original “classical” action and there is no acceptable 
finite set of correction terms that can be added to it 
to remove this defect. Moreover, when otherwise 
tolerably behaved matter field theories that are 
renormalizable in a flat-spacetime context are 
coupled to general relativity, the gravitational 
couplings pollute the matter theories with non- 
renormalizable divergences. This is a key aspect of 
the great difficulty that has been encountered in 
interpreting gravity as a quantum theory. 

Supersymmetry, with its divergence-canceling 
powers, was thus a very attractive option in the 
struggle to formulate a quantum theory of gravity, and 
the creation of a supergravity theory was thus a very 
high priority task. This was achieved in 1976 by 
Freedman, Ferrara, and Van Nieuwenhuizen using the 
technique of iterative Noether coupling to build up this 
nonlinear theory order-by-order in powers of the 
fermionic fields. The fermionic partner of the massless 
spin-2 “graviton” field is a massless fermionic spin-3/2 
field that has come to be called the “gravitino.” 

A second 1976 paper by Deser and Zumino soon 
followed, emphasizing how supergravity manages to 
circumvent the well-known problems of coupling 
spins higher than 1 to gravity. A key point in 
achieving this result is the role played by the local 
version of the supersymmetry algebra [1]-[5]. As 
one can see from the translations occurring on the 
right-hand side of [1], when one replaces translation 
symmetry by local general coordinate invariance in a 
gravitational context, the supersymmetry transformations
must themselves become local as well. Local
symmetries allow for transformation parameters
that are arbitrary functions of the spacetime coordinates $x^m$, and
in interacting theories they require coupling of the
corresponding “gauge field” to a conserved current.
In the case of supergravity, the gravitino field $\psi_{m\alpha}$
plays this gauge-field role, and its coupling to the
conserved current of supersymmetry is the key to
allowing a consistent coupling between the spin-2
graviton and the spin-3/2 gravitino.


The Minimal Supergravity Action 


The action for minimal supergravity in D = 4
dimensions can be written, using the vierbein
formalism where the metric is expressed as a
quadratic expression in a nonsymmetric 4×4
vierbein matrix $e^a{}_m$, $g_{mn} = e^a{}_m e^b{}_n \eta_{ab}$, as


1-3] dx det(e)R(e, w(e) + K(1)) 
where $\kappa = \sqrt{8\pi G}$ is the gravitational coupling
constant,
is the usual vierbein formalism spin connection (in
which $e^a{}_{n,m} = \partial_m e^a{}_n$ and $e^{ma}$ is the matrix inverse of
$e_{ma}$), and


$$K_{mab}(\psi) = -\frac{i\kappa^2}{4}\left(\bar\psi_m\gamma_a\psi_b + \bar\psi_a\gamma_m\psi_b - \bar\psi_m\gamma_b\psi_a\right)$$ [8]


is the fermionic contorsion, an additional part of the
covariant derivative $D_m(\omega(e) + K(\psi))$ appearing in the
action [6]. (Indices m, n are taken to be “world”
indices while indices a, b are “tangent space” indices;
one can convert from one type to another using the
vierbein $e^a{}_m$ and its inverse, e.g., $\psi_a = e_a{}^m\psi_m$.)

Keeping the terms in the action grouped as above
using the nonstandard covariant derivative built from $\omega^{ab} + K^{ab}$
is what has been called the “1.5 order formalism”: this
greatly simplifies the writing and analysis of the
supergravity action [6]. In the action [6], one has the
Ricci scalar $R(e, \omega(e) + K(\psi))$ written in terms of this
generalized torsional spin connection. One may of
course expand out all the $\omega + K$ combinations
and write the nonlinear fermionic terms separately.
Doing this produces a quartic term


" oe - - 
La — [UP A uf (ub yate + 2 ba 1ptbc) 
showing the highly nonlinear nature of supergravity
theory — when expanded out, the theory becomes 
much more cumbersome to study. The 1.5 order 
formalism trick is one of a large number of algebraic 
simplifications that had to be developed in order to 
master the technical aspects of supergravity. It also 
reveals a characteristic physical feature: this theory 
naturally involves a connection with torsion built 
from the fermionic fields. 

In terms of the torsional covariant derivative
$D_m\epsilon(x) = \left(\partial_m + \tfrac{1}{4}(\omega_m{}^{ab}(e) + K_m{}^{ab}(\psi))\gamma_{ab}\right)\epsilon(x)$ of the
infinitesimal supersymmetry parameter $\epsilon(x)$, the
local supersymmetry transformations which leave
the action [6] invariant (up to the integral of a total
derivative) are


$$\delta e^a{}_m = -i\kappa\,\bar\epsilon\gamma^a\psi_m$$ [9]
$$\delta\psi_m = 2\kappa^{-1}D_m\epsilon$$ [10]


The inhomogeneous part $2\kappa^{-1}\partial_m\epsilon$ in the gravitino
transformation [10] demonstrates the gauge-field
nature of the gravitino field. For a distribution of
“supermatter” fields (e.g., Wess-Zumino model
scalars and spinors), the integrated “charge” that
one would get from a Gauss's law surface integral at
spatial infinity using the gravitino gauge field is the
total supercharge $Q_\alpha$, which in turn plays the role of
the supersymmetry generator in the original matter-sector
supersymmetry algebra [1].

Both the gravitational field and the gravitino field 
are thus effectively gauge fields, albeit not of a 
standard Yang-Mills type. The local algebra is a 
deformation of the rigid supersymmetry algebra [1]- 
[5], generalizing the relation between general covar- 
iance and flat-space Poincaré symmetry. Some basic 
consequences of the flat-space algebra are preserved, 
however. An extremely important instance of this is 
energy positivity. As one can see by multiplying [1]
by $\gamma^0$ and then contracting on the spinor index,


$$E = P^0 = \tfrac{1}{8}\sum_\alpha \{Q_\alpha, Q^\dagger_\alpha\}$$


The right-hand side is manifestly non-negative
provided the theory is quantized in a positive-metric
Hilbert space. One can see this even more explicitly
in a Majorana spinor basis, where $Q^\dagger_\alpha = Q_\alpha$.
Accordingly, for flat-space supersymmetric theories, 
one obtains directly the result that energy is 
non-negative. This carries over to the local algebra 
of supergravity, where the total energy is obtained 
from a Gauss’s law integral over the sphere at 
spatial infinity. 

In general relativity, an integrated energy can be 
defined with respect to an asymptotic timelike 
Killing vector at spatial infinity. Showing that this
energy is non-negative remained for decades a
famously unsolved problem in gravitational physics;
it was ultimately proven in Yau’s positive-energy 
theorem. The algebraic structure of supergravity 
makes energy positivity much more transparent, 
however. Since pure general relativity can be 
obtained by setting the gravitino field to zero, this 
result is inherited by pure Einstein theory as a 
consequence of its being embeddable into super- 
gravity. Energy positivity can thus be proved even at 
the classical level using ideas taken from super- 
gravity, as was done by Witten and later streamlined 
by Nester, in an argument much simpler than Yau's 
proof. This argument writes the energy as an 
integral over a  positive-semidefinite expression 
quadratic in a commuting spinor field which is 
analogous to the (anticommuting) spinor parameter 
of supergravity in the transformations [9] and [10]. 


Auxiliary Fields and Superspace 


Supergravity shares with flat-space supersymmetric 
theories a curious technical feature that gives a hint 
of a new underlying geometry. Standard counting of
the gauge-invariant continuous degrees of freedom of
the graviton and the gravitino in momentum space
yields the same result per momentum value: two
bosonic degrees of freedom and two fermionic
degrees of freedom. This accords with the general
requirement in supersymmetric theories that the
numbers of bosonic and fermionic degrees of freedom
match. This count follows from the Einstein
and spin-3/2 equations of motion, or “on-shell.”
If one compares the count of nongauge degrees
of freedom without using the equations of motion
(i.e., “off-shell”), one obtains an imbalance, however:
six nongauge graviton versus 12 nongauge
gravitino field components. This is directly related to another
puzzling feature of the supergravity realization of 
local supersymmetry: the local supersymmetry alge- 
bra closes onto a finite set of transformations only 
when the equations of motion are imposed. 
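
The off-shell numbers quoted above can be reproduced by simple bookkeeping. The sketch below assumes the usual way of arriving at them, which the text does not spell out: one subtracts one component per local gauge parameter (general coordinate transformations, local Lorentz rotations, local supersymmetry) from the raw component count of each field. The function name and dictionary keys are illustrative.

```python
def offshell_counts():
    """Off-shell (nongauge) component counts for minimal D = 4 supergravity."""
    vierbein = 4 * 4          # nonsymmetric 4x4 vierbein matrix
    diffeomorphisms = 4       # local general coordinate parameters (assumed subtraction)
    local_lorentz = 6         # local SO(3,1) tangent-space rotations (assumed subtraction)
    bosonic = vierbein - diffeomorphisms - local_lorentz

    gravitino = 4 * 4         # vector index times 4 Majorana spinor components
    local_susy = 4            # components of the local supersymmetry parameter
    fermionic = gravitino - local_susy

    auxiliary = 4 + 1 + 1     # a vector, a scalar and a pseudoscalar
    return {"bosonic": bosonic, "fermionic": fermionic, "auxiliary": auxiliary}


if __name__ == "__main__":
    counts = offshell_counts()
    print(counts, counts["fermionic"] - counts["bosonic"])
```

The mismatch of six between the two counts is exactly the number of components carried by a vector plus a scalar-pseudoscalar pair.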

As in flat-space supersymmetry, the cure for this 
problem is to add nondynamical “auxiliary” fields 
to the action. In the supergravity case, the 
imbalance in the off-shell bose-fermi field count 
indicates that an additional six bosonic fields are 
needed. In the minimal set of auxiliary fields, these
organize into a vector $b_m$ and a scalar-pseudoscalar
pair $M$, $N$; the additional terms in the action [6] are
simply


while the local supersymmetry transformations are 
changed to include the auxiliary fields, e.g., the 
gravitino transformation becomes 


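$$\delta\psi_m = 2\kappa^{-1}D_m(\omega, K)\epsilon + i\gamma_5\left(b_m - \tfrac{1}{3}\gamma_m\gamma^n b_n\right)\epsilon - \tfrac{1}{3}\gamma_m(M + i\gamma_5 N)\epsilon$$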


while the auxiliary fields transform into expressions
that vanish on-shell. Since the field equations for the
auxiliary fields are algebraic in character and since
for source-free supergravity they have the simple
solution $b_m = M = N = 0$, one can directly regain the
on-shell formalism by algebraically eliminating the
auxiliary fields.

The inclusion of auxiliary fields is not an empty 
trick, however. The local supersymmetry transformations
including the auxiliary fields form a closed
set without the use of equations of motion (“off-shell
closure”). This standardizes the form of the
supersymmetry transformations so that they remain 
the same even when supermatter is coupled to 
supergravity instead of needing a case-by-case 
Noether construction as in the case without the 
auxiliary fields. In this way, a standard set of 
coupling rules can be drawn up, known as the 
“tensor calculus.” This tensor calculus is of great 
importance as it allows for the construction of 
general models of supergravity coupled to supermatter
(Wess-Zumino multiplets and super Yang-Mills
multiplets consisting of spin-1 gauge fields and
spin-1/2 “gaugino” fields). These general couplings
form the basis for essentially all supersymmetric 
phenomenology, and in particular for the formula- 
tion of the Minimal Supersymmetric Standard 
Model. Since supersymmetry is not directly observed 
in low-energy physics, it must be spontaneously 
broken, like many other gauge symmetries. As it 
happens, the physically realistic mechanisms of 
supersymmetry breaking all originate from super- 
gravity couplings derived using the tensor calculus. 

Given the regular set of tensor calculus rules for 
coupling supergravity to supermatter, one is led to 
suspect that a geometrical structure lies in the 
background. This is indeed the case; the correspond- 
ing construction is known as “superspace.” 

The basic idea of superspace is a generalization of
the coset space construction of Minkowski space as
the coset space given by the Poincaré group divided
by the Lorentz group: $\mathcal{M}_4(x^m) = \mathrm{ISO}(3,1)/\mathrm{SO}(3,1)$.
For supersymmetric theories, one analogously constructs
Superspace$(x^m, \theta^\alpha)$ = Graded Poincaré/SO(3, 1).
The basic ideas of superspace were introduced by
Akulov and Volkov in 1972, while the idea of
expanding in “functions” on this space, thus yielding
“superfields,” was introduced by Salam and Strathdee
in 1974. This led to a formulation of the Wess-Zumino
model in terms of a chiral superfield $\phi(x, \theta)$,
which is subjected to a covariant superspace
constraint.

In order to manage the formalism of superspace
more efficiently, it is convenient to use a two-component
spinor formalism corresponding to the
Weyl basis for the Dirac gamma matrices, in which
the Majorana spinor coordinate $\theta$ is represented as

$$\theta = \begin{pmatrix}\theta_\alpha \\ \bar\theta^{\dot\alpha}\end{pmatrix}$$

where two-component indices $\alpha, \dot\alpha = 1, 2$ are raised
and lowered with the covariant two-index antisymmetric
tensors $\epsilon_{\alpha\beta}$, $\epsilon_{\dot\alpha\dot\beta}$, which both take the numerical
value $i\sigma_2$. The flat-space fermionic covariant
derivatives are then


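$$D_\alpha = \frac{\partial}{\partial\theta^\alpha} - i\sigma^m_{\alpha\dot\alpha}\bar\theta^{\dot\alpha}\partial_m, \qquad \bar{D}_{\dot\alpha} = -\frac{\partial}{\partial\bar\theta^{\dot\alpha}} + i\theta^\alpha\sigma^m_{\alpha\dot\alpha}\partial_m$$ [11]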


where the $\sigma^m_{\alpha\dot\alpha} = (1, \sigma_i)$ for m = (0, i) (where $\sigma_i$ are the
Pauli matrices) are the Van der Waerden matrices
which establish the mapping between vector indices
and (chiral, antichiral) spinor index pairs. The
Wess-Zumino multiplet is then described by a
complex chiral superfield satisfying the constraint
$\bar{D}_{\dot\alpha}\phi = 0$. Unlike the situation in Minkowski space,
where the only Lorentz-covariant solution to a
constraint that sets to zero the $\partial/\partial x^m$ derivatives is
a constant, superspace has a reducible set of
coordinates $(x^m, \theta^\alpha, \bar\theta^{\dot\alpha})$ and, as a result, requiring $\phi$
to be annihilated by $\bar{D}_{\dot\alpha}$ does not require the whole
superfield to be a constant.

Since the fermionic coordinates of superspace
$\theta^\alpha, \bar\theta^{\dot\alpha}$ are anticommuting (i.e., they are elements of
a Grassmann algebra), and since $\alpha, \dot\alpha = 1, 2$ have an
index range of two, powers of them higher than the
second order necessarily vanish. As a result, superfields
like $\phi$ can be expanded into sets of component
fields, each of which is an ordinary field in
Minkowski space. In this way, a chiral superfield
expands into $(A(x), B(x), \chi_\alpha(x), \bar\chi_{\dot\alpha}(x), F(x), G(x))$,
where the fields $A$, $B$, $\chi$, and $\bar\chi$ are the physical
fields of the Wess-Zumino model, while $F$ and $G$
are dimension-2 auxiliary fields. In this way, the
auxiliary fields of supersymmetry naturally fit into a
superspace formalism as higher components in a
superfield expansion. It is in this sense that they
point toward the superspace formulations of supersymmetric
theories.
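
The truncation of the expansion can be made concrete with a toy model of anticommuting generators. The sketch below is an illustration only: monomials are represented as sorted tuples of generator indices, the product of two monomials vanishes as soon as a generator repeats, and each transposition of generators flips the sign. The function names are hypothetical. With two generators (one two-component $\theta$) there are four independent monomials, and with four generators ($\theta$ together with $\bar\theta$) there are sixteen.

```python
from itertools import combinations


def grassmann_product(a, b):
    """Multiply two monomials given as sorted tuples of distinct generator indices.

    Returns (sign, monomial); the sign is 0 when a generator appears twice.
    """
    if set(a) & set(b):
        return 0, ()
    seq = list(a) + list(b)
    sign = 1
    # Bubble sort; every swap of two anticommuting generators flips the sign.
    for i in range(len(seq)):
        for j in range(len(seq) - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)


def monomials_by_degree(n_generators):
    """All independent monomials, grouped by degree in the generators."""
    return {d: list(combinations(range(n_generators), d))
            for d in range(n_generators + 1)}


if __name__ == "__main__":
    print(monomials_by_degree(2))
    print(grassmann_product((1,), (0,)))
```

For two generators the degrees 0, 1 and 2 contain 1, 2 and 1 monomials, which matches the pattern of a complex scalar, a two-component spinor and a complex auxiliary field in a chiral superfield written in suitably adapted coordinates (an assumption of this sketch).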

For supergravity, there are a number of different
approaches to realizing the theory in superspace,
and these correspond naturally to the various
possible choices of auxiliary-field sets. With the
minimal set, the supergravity multiplet is described
by a superfield carrying a vector index $H_m(x, \theta, \bar\theta)$;
this superfield is called the prepotential of supergravity.
Note the fact that since the divisor group in
the coset-space construction of superspace is the
Lorentz group, superfields may carry indices corresponding
to any Lorentz representation. The component-field
expansion of the $H_m$ superfield yields
the physical $e^a{}_m$, $\psi_{m\alpha}$, $\bar\psi_{m\dot\alpha}$ and auxiliary fields
$(b_m, M, N)$ together with a number of other components
of dimension lower than those of the physical
fields. This is not, however, all that surprising: even
the physical fields $e^a{}_m$, $\psi_{m\alpha}$, $\bar\psi_{m\dot\alpha}$ contain components
that are not directly related to the physical modes
because we are dealing with a gauge theory. What
occurs in superspace is a redundant expression of
the supergravity multiplet with the presence of
various component gauge fields.

The full expression of local supersymmetry in
superspace can be given in a number of different
formalisms. Suffice it here to indicate the transformation
of the linearized theory expanded in small
fluctuations about empty flat superspace. Converting
the vector index of $H_m$ into a (chiral, antichiral)
spinor index pair via $H_{\alpha\dot\alpha} = \sigma^m_{\alpha\dot\alpha}H_m$, the linearized
local symmetry transformation of the supergravity
multiplet is


$$\delta H_{\alpha\dot\alpha} = D_\alpha\bar{L}_{\dot\alpha} - \bar{D}_{\dot\alpha}L_\alpha$$ [12]


where the transformation parameter superfield $L_\alpha$
carrying a spinor index is antichiral: $D_\beta L_\alpha = 0$
(while the conjugate parameter superfield $\bar{L}_{\dot\alpha}$ is
chiral). Expanding in component fields and comparing
with the expansion of $H_m$, one sees that the
chiral spinor superfield contains precisely the components
needed to provide the standard gauge
symmetries of $e^a{}_m$ and $\psi_{m\alpha}$, $\bar\psi_{m\dot\alpha}$ and also to transform
the other gauge components of $H_m$ as well.
One can then make various gauge choices according
to taste in a given context.

One frequently encountered superspace gauge
choice sets to zero all the fields in $H_m$ except for
the physical and auxiliary fields ($e^a{}_m$, $\psi_{m\alpha}$, $\bar\psi_{m\dot\alpha}$,
$b_m$, $M$, $N$). This is called a Wess-Zumino gauge
following the analogy to a similar construction for
super Maxwell theory (containing spins 1 and 1/2).
Wess-Zumino gauge choices are not, however,
supersymmetrically covariant. This shows up when
one works out the supersymmetry algebra in such a
gauge: the presence of auxiliary fields gives closure,
as required, without use of the equations of motion,
but the anticommutator of two supersymmetry
transformations when acting on a gauge field such
as the Maxwell field or the vierbein gives a
combination of the anticipated translation with an
admixture of a gauge transformation with a field-dependent
parameter.

The prepotential superfield of minimal super- 
gravity can itself be fit into larger formalisms in 
superspace that are analogous to standard differen- 
tial geometry, with supervielbeins, superspin con- 
nections and so forth. An unavoidable feature of 
these more seemingly geometric constructions, how- 
ever, is their high degree of redundancy: superspace 
vielbeins and spin connections carrying Lorentz 
indices have many component fields in addition to 
those found in the prepotential. This redundancy is 
then cut down in turn by imposing superspace 
constraints on the geometrical superfields, for 
example, on the components of the torsion tensor 
in superspace. 


Extended Supergravities and 
Supergravities in Higher Dimensions 


The possible graded extensions of the Poincaré
algebra allow for more than one spinorial generator.
Thus, one can have N supersymmetry generators
$Q^i_\alpha$, $i, j = 1, \ldots, N$, with basic anticommutators
(in Lorentz two-component notation)


$$\{Q^i_\alpha, \bar{Q}_{\dot\beta j}\} = 2\delta^i_j\,\sigma^m_{\alpha\dot\beta}P_m$$ [13]
$$\{Q^i_\alpha, Q^j_\beta\} = 2\epsilon_{\alpha\beta}\,a^{\ell\,ij}Z_\ell$$ [14]
$$\{\bar{Q}_{\dot\alpha i}, \bar{Q}_{\dot\beta j}\} = 2\epsilon_{\dot\alpha\dot\beta}\,a^*_{\ell\,ij}Z^\ell$$ [15]


The right-hand sides of [14] and [15] allow for the
possibility of nonvanishing anticommutators between
supersymmetry generators of the same chirality. As
one can see from the overall symmetry in pairs of
indices $(\alpha i, \beta j)$, the coefficients $a^{\ell\,ij}$ must be antisymmetric
in the $i, j$ indices, so such nonvanishing same-chirality
anticommutators cannot occur for N = 1.
The corresponding abelian generators $Z_\ell$ are called
central charges since they must commute with all the
other $(Q^i_\alpha, \bar{Q}_{\dot\alpha j}, M_{mn}, P_m)$ elements of the algebra.

The $i, j$ indices may be endowed with a symmetry
meaning as well, although this is not obligatory in
every model. When the central charges are absent,
$Z_\ell = 0$, one has U(N) (or SU(N)) as the maximal
such external automorphism; the choice of index
placement on $Q^i_\alpha$ and $\bar{Q}_{\dot\alpha i}$ anticipates this. If such a
symmetry is realized in a given model, the fact that
the $Q^i_\alpha$, $\bar{Q}_{\dot\alpha i}$ carry representations both for that
symmetry and for the spacetime Poincaré symmetry
demonstrates how supersymmetry evades the no-go
theorem barring unified spacetime and internal
symmetries. This theorem (the Coleman-Mandula
theorem) is evaded because, at the time it was
written, graded Lie symmetry algebras were not yet
considered. For nonzero central charges, the external
automorphism algebra becomes a subalgebra of
U(N) determined by the requirement that invariant
antisymmetric tensors $a^{\ell\,ij}$ exist.

The representations of the algebra [13]-[15] span
an increasing range of spins as the number N of
D = 4 supersymmetries increases. For massive representations
without central charges, the spins of the
smallest supersymmetry representation extend from
states of spin 0 (scalars) up to spin N/2; with central
charges, the spin range can be shortened down to a
minimum range of N/4. For massless representations,
the range of helicities in a PCT (parity, charge
conjugation, time reversal) symmetric multiplet is from
$-N/4$ to $N/4$. This spin range has an important
implication for the maximal extension of supersymmetry
that can be realized in an interacting
supersymmetric field theory, because no interacting
theories with a finite set of spins exist for spins > 2.
Accordingly, the maximal extension of supersymmetry
is N = 8 for massless theories, and in order to
have massive states with spins that do not exceed
spin 2 in an N = 8 theory, the central charges have
to be active for maximal multiplet shortening.
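
The bound on the number of supersymmetries follows from the quoted spin ranges by simple arithmetic. The sketch below takes those ranges at face value (maximum helicity N/4 for massless PCT-symmetric multiplets, maximum spin N/2 for massive multiplets without central charges and N/4 with maximal shortening) and searches for the largest N compatible with a spin limit of 2. The function names are illustrative.

```python
from fractions import Fraction


def max_massless_helicity(n_susy):
    """Top helicity of the massless PCT-symmetric multiplet, taken as N/4."""
    return Fraction(n_susy, 4)


def max_massive_spin(n_susy, shortened):
    """Top spin of the smallest massive multiplet: N/2, or N/4 when maximally shortened."""
    return Fraction(n_susy, 4) if shortened else Fraction(n_susy, 2)


def max_extension(spin_limit=2):
    """Largest N whose massless multiplet stays within the given spin limit."""
    n = 0
    while max_massless_helicity(n + 1) <= spin_limit:
        n += 1
    return n


if __name__ == "__main__":
    n = max_extension()
    print(n, max_massive_spin(n, shortened=False), max_massive_spin(n, shortened=True))
```

For the largest allowed N, an unshortened massive multiplet would reach spin 4, which is why the central charges have to be active for massive states.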

The N = 8 supergravity theory, found by Cremmer
and Julia in 1978, is thus the largest possible
supergravity in D = 4 dimensions. It contains the
following “spin” range (allowing for a certain
imprecision of expression: for massless fields one
should really speak only of helicities)


N = 8 supergravity spins

Spin           2    3/2    1    1/2    0
Multiplicity   1     8    28    56    70


In order to realize the automorphism SU(8) symmetry,
one has to consider the field strengths for the 28
spin-1 fields, separated into complex self-dual and
anti-self-dual parts in their antisymmetric Lorentz
indices. These complex field strengths can then be
endowed with a complex 28-dimensional representation
of SU(8). The 70 scalars, on the other hand,
fit precisely into the four-index antisymmetric
self-dual representation of SU(8),


$$\phi_{[ijkl]} = \frac{1}{4!}\,\epsilon_{ijklmnpq}\,\bar\phi^{[mnpq]}$$


It is the eight-index epsilon tensor here that restricts the automorphism
group to SU(8) instead of U(8).

The SU(8) automorphism symmetry of N=8 
supergravity theory is linearly realized. It plays an 
important role in another symmetry of this theory 
which is highly nonlinear. This theory has a 


remarkable nonlinear E7 symmetry. In fact, the 70 
scalars form a nonlinear sigma model with the fields 
taking their values in the coset space E7/SU(8) (of 
dimension 133 − 63 = 70), where the SU(8) divisor 
is the linearly realized automorphism group dis- 
cussed above. 

The extended supergravities point to another 
aspect of supergravity theory: the existence of 
higher-dimensional supergravities, from which the 
extended theories in D =4 spacetime can be derived 
by Kaluza-Klein dimensional reduction. If one 
considers a D′-dimensional massless theory in a 
spacetime where d dimensions form a compact 
d-torus, then the theory can be viewed as a D = D′ − d 
dimensional theory in which the discrete Fourier 
modes arising from the periodicity requirements on 
the d-torus give rise to towers of equally spaced 
massive Kaluza-Klein states, plus a massless sector 
in D′ − d dimensions corresponding to the modes 
with no dependence on the d-torus coordinates. 
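As a rough illustration of the equally spaced towers, the sketch below lists Kaluza-Klein masses for a massless field on a rectangular d-torus. It assumes circles of radius R_i and units with ħ = c = 1, so that mode numbers (n_1, ..., n_d) give m² = Σ (n_i / R_i)²; the function `kk_masses` and its parameters are hypothetical names chosen for this example.

```python
from itertools import product
from math import sqrt


def kk_masses(radii, n_max):
    """Masses of Fourier modes on a rectangular torus with the given radii.

    Assumption: a massless higher-dimensional field, so each mode
    (n_1, ..., n_d) has m^2 = sum_i (n_i / R_i)^2 in units hbar = c = 1.
    """
    masses = {}
    for modes in product(range(-n_max, n_max + 1), repeat=len(radii)):
        masses[modes] = sqrt(sum((n / r) ** 2 for n, r in zip(modes, radii)))
    return masses


circle = kk_masses([2.0], n_max=3)
for modes in sorted(circle):
    print(modes, circle[modes])
```

On a single circle the masses are |n|/R, so the tower is evenly spaced, and the n = 0 mode is the massless sector with no dependence on the torus coordinate.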

Importantly, N=8 supergravity in four- 
dimensional spacetime can be obtained in this way 
from a supergravity theory that exists in 11 space- 
time dimensions. Upon dimensional reduction on a 
7-torus to four dimensions, one obtains N = 8, D = 4 
supergravity at the massless level, plus an infinite 
tower of massive N = 8 supermultiplets with central 
charges so that their spin range extends only up to 
spin 2. This D =11 supergravity was in fact found 
before the N=8 theory by Cremmer, Julia, and 
Scherk, with the details of the more complicated 
N = 8, D = 4 theory being worked out via the 
techniques of Kaluza-Klein dimensional reduction. 
The fields of the D = 11 theory include an exotic 
field type not encountered in D = 4 theories: the 
bosonic fields of the theory comprise the graviton $e_M{}^A$, 
plus a three-index antisymmetric tensor gauge field 
$C_{MNP}$. Counting the number of propagating modes 
of these fields for a given momentum value gives 
44 + 84 = 128 bosonic degrees of freedom. This 
precisely balances the 128 fermionic degrees of 
freedom coming from the D = 11 gravitino $\psi_M$. 
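The 44 + 84 = 128 count can be reproduced with a small sketch. It assumes the usual little-group counting for massless fields in D dimensions (representations of SO(D − 2)) and takes 16 as the number of on-shell real components of a D = 11 Majorana spinor; the function names are hypothetical.

```python
from math import comb


def graviton_dof(dim):
    """Symmetric traceless tensor of the little group SO(dim - 2)."""
    t = dim - 2
    return t * (t + 1) // 2 - 1


def p_form_dof(dim, p):
    """Antisymmetric rank-p tensor of SO(dim - 2)."""
    return comb(dim - 2, p)


def gravitino_dof(dim, spinor_dof_on_shell):
    """Vector-spinor of SO(dim - 2) with its gamma-trace removed."""
    return (dim - 3) * spinor_dof_on_shell


bosons = graviton_dof(11) + p_form_dof(11, 3)
fermions = gravitino_dof(11, 16)  # 16: assumed on-shell Majorana components
print(bosons, fermions)
```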


Supergravity Effective Theories, Strings 
and Branes 


The hope for a cancellation of the ultraviolet 
divergences in a supersymmetric theory of gravity 
turned out to be ephemeral, although there is in fact 
a postponement of the divergence onset until a 
higher order in quantum field loops. There is 
agreement that the nonmaximal supergravities 
diverge at the three-loop order. For the 
N = 8, D = 4 theory, the situation remains unclear, 
but divergences are nonetheless expected to occur at 
some finite loop order. 

This persistence of nonrenormalizability in D = 4 
supergravity theories is no longer seen as a disaster, 
however, because these theories are now seen as 
effective theories for the massless modes arising 
from a deeper microscopic quantum theory. In 
addition, the theories that are most directly con- 
nected to this underlying quantum theory are, 
surprisingly, the maximal supergravities in spacetime 
dimensions 10 and 11. D = 11 supergravity can 
be dimensionally reduced on a 1-torus (i.e., a circle) 
to D = 10, where the massless sector yields type IIA 
supergravity theory. This theory is the effective 
theory for a consistent quantum theory of type IIA 
superstrings in D=10. Theories of relativistic 
strings (i.e., one-dimensional extended objects) 
have strikingly different properties from theories of 
point particles. In particular, the spread-out nature 
of the interactions leads to a damping out of the 
quantum field theory divergences, while the under- 
lying supersymmetry causes a cancellation of other 
infinities that could have arisen owing to the two- 
dimensional nature of the string world sheets. This 
gives, for the first time, a perturbatively well-defined 
quantum theory including gravity. 

In addition to the type IIA theory, there are four 
other consistent superstring theories in D = 10, and 
these are in turn related to various D = 10 supergravity 
effective theories for the massless modes: 
type IIB, E8 × E8 heterotic, SO(32) heterotic, and 
SO(32) type I. Remarkably, the maximal D = 11 
supergravity enters into this picture as well, as a 
consequence of a pattern of duality symmetries that 
have been found among the superstring theories. 

The dualities of string theory are directly related 
to the nonlinear symmetries of the dimensionally 
reduced supergravities in D = 4. The string quantum 
corrections do not respect the E7 symmetry of the 
classical N = 8 theory, but they do respect a discrete 
subgroup of this symmetry in which the E7 group 
elements are required to take integer values: E7(Z). 

This quantum-level restriction to a discrete sub- 
group can be seen from another phenomenon 
characteristic of superstring theories: the existence 
of “electric” and “magnetic” brane solutions. The 
antisymmetric-tensor (or “form”) fields of the 
higher-dimensional supergravities naturally give rise 
to solitonic solutions in which p + 1 dimensions 
form a flat Poincaré invariant subspace. This can be 
interpreted as the world volume of an infinite 
p-brane extended object. In the D = 11 supergravity 
theory, the branes that emerge in this way are a 
2-brane and a 5-brane. The three-dimensional world 
volume of the 2-brane naturally couples to the 
3-form field $C_{MNP}$, just as an ordinary Maxwell 
vector field couples to the one-dimensional world 
line of a point particle (or 0-brane). The 2-brane is 
thus naturally electrically charged with respect to 
the 3-form field; its charge can be obtained, in a 
direct generalization of the Maxwell case, from a 
Gauss’ law integral of the field strength $H_{[4]} = dC_{[3]}$ 
over a 7-sphere at spatial infinity in the eight 
directions transverse to the brane worldvolume. 
The 5-brane, on the other hand, has a magnetic 
type charge; it is the 7-form dual to $H_{[4]}$ that is 
integrated to give its charge. In addition to these 
static infinite p-branes, the theory contains dynami- 
cal finite-extent branes as well, although for these 
one generally does not have explicit solutions. 

As one reduces a higher-dimensional supergravity 
to lower and lower dimensions, there is a proliferation 
of solitonic brane solutions of varying dimensionality, 
and of both electric and magnetic charge types. In a 
quantum theory context, these electrically and magne- 
tically charged branes pair up in ways that must satisfy 
a generalization of the Dirac quantization condition 
for D=4 electric and magnetic point particles. This 
ends up requiring all the supergravity solitonic brane 
charges to lie on a charge lattice. It is the requirement 
that this discrete brane-charge lattice be respected that 
restricts the classical supergravity nonlinear symmetry 
groups to discrete duality subgroups. 

The dualities relate brane solutions within a given 
theory and also between different string theories. 
They include transformations that invert the radii of 
compactifying tori, giving a large-small compactifi- 
cation scale duality. They also include transforma- 
tions that invert the string coupling constant, thus 
interchanging strong and weak coupling. The type 
IIB theory, for example, is self-dual under strong- 
weak coupling duality. In the case of the type IIA 
theory, however, something remarkable happens. 
The strong coupling limit of this theory turns out to 
be related by duality, not to another string theory, 
but to the maximal D = 11 supergravity. The role of 
the Kaluza-Klein massive modes for the 11 to 10 
reduction is played by an infinite tower of extremal 
charged black holes. 

Thus, even D = 11 supergravity theory has a role 
to play in the effective theory of the underlying 
quantum dynamics. This underlying theory has been 
dubbed “M-theory.” It is still only partially understood, 
but many of its most important properties are 
presaged by the remarkable nonlinear structure of 
the classical supergravities. 

Supermanifolds; Superstring Theories; 
Supersymmetric Particle Models; Symmetries 
and Conservation Laws; Symmetries in Quantum 
Field Theory: Algebraic Aspects. 


Further Reading 


Buchbinder IL and Kuzenko SM (1998) Ideas and Methods of 
Supersymmetry and Supergravity. Bristol: IoP Publishing Ltd. 

Stelle KS (1998) BPS branes in supergravity, Trieste 1987 School of 
High-Energy Physics and Cosmology, arXiv:hep-th/9803116. 

Van Nieuwenhuizen P (1981) Supergravity. Physics Reports 68: 
189-398. 

Wess J and Bagger J (1983) Supersymmetry and Supergravity. 
Princeton: Princeton University Press. 


Introduction 


A supermanifold is a generalization of a classical 
manifold to include coordinates that are in some 
sense anticommuting. Much of the motivation for 
the study of supermanifolds comes from supersymmetric 
physics, where it is useful to have a 
formalism which treats fermions and bosons in the 
same way. The underlying reason for the 
effectiveness of supermanifolds is that anticommuting 
coordinates allow the fermionic canonical 
anticommutation relations to be handled in a way 
analogous to the bosonic canonical commutation 
relations. Supersymmetric methods have proved 
immensely effective in fundamental physics; they 
also play a considerable role in geometrical index 
theory in mathematics. In this article we describe 
supermanifolds from two points of view, geometric 
and algebraic, and consider some of the standard 
features of manifold calculus, including integration, 
since this is an area where the distinctive features of 
this generalized geometry are particularly apparent. 


One situation where supermanifolds are used in 
physics is in the superspace formulation of super- 
gravity, where the physical fields are found in the 
component fields in the Taylor expansion of func- 
tions on the supermanifold in anticommuting vari- 
ables. More fundamentally, the symmetry groups of 
supersymmetric theories have commuting and anti- 
commuting generators, and are examples of super Lie 
groups, which are supermanifolds with a compatible 
group structure. 


Some Algebraic Preliminaries 


The coordinates of a supermanifold have particular 
algebraic features which are best understood by 
introducing some of the basic concepts of super- 
algebra. (The word super here does not imply 
superiority, simply the extension of some classical 
concept to have odd as well as even, anticommuting 
as well as commuting, elements.) A “super vector 
space” is a vector space V together with a direct sum 
decomposition 


$$V = V_0 \oplus V_1$$ [1] 


The subspaces $V_0$ and $V_1$ are referred to, respectively, 
as the even and odd parts of V. A general 
element v of V thus has the unique decomposition 
$v = v_0 + v_1$ with $v_0$ in $V_0$ and $v_1$ in $V_1$. We will 
normally consider homogeneous elements, that is, 
elements v which are either even or odd, with parity 
denoted by |v|, so that |v| = i if v is in $V_i$, i = 0, 1. 
(Arithmetic of parity indices i = 0, 1 is always 
modulo 2.) A superalgebra is a super vector space 
whose elements can be multiplied together in such a 
way that the product of an even element with an 
even element and that of an odd element with an 
odd element are both even, while the product of an 
odd element with an even element is odd; more 
formally: 


Definition 1 


(i) A “superalgebra” is a super vector space 
$A = A_0 \oplus A_1$ which is also an algebra which 
satisfies $A_i A_j \subset A_{i+j}$. 

(ii) The superalgebra is “supercommutative” if, for 
all homogeneous a, b in A, $ab = (-1)^{|a||b|}\,ba$. 


If the algebra is supercommutative then odd 
elements anticommute, and the square of an odd 
element is zero. The basic supercommutative super- 
algebra used is the real Grassmann algebra with 
generators $1, \beta_1, \beta_2, \ldots$ and relations 


$$1\beta_i = \beta_i 1 = \beta_i, \qquad \beta_i\beta_j = -\beta_j\beta_i$$ [2] 
A typical element of this algebra is then 


$$a = a_0 1 + \sum_i a_i\beta_i + \sum_{i<j} a_{ij}\beta_i\beta_j + \cdots$$ [3] 


This algebra, which is denoted $\mathbb{R}_S$, is a superalgebra 
with $\mathbb{R}_S := \mathbb{R}_{S0} \oplus \mathbb{R}_{S1}$, where $\mathbb{R}_{S0}$ consists of linear 
combinations of products of even numbers of the 
anticommuting generators, while $\mathbb{R}_{S1}$ is built similarly 
from odd products. 
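The relations [2] and the even/odd split can be made concrete with a minimal sketch. It assumes an element is stored as a dictionary from sorted tuples of generator indices to real coefficients, with the empty tuple standing for the unit; `gmul` and `parity` are hypothetical helper names, not part of the article.

```python
def gmul(a, b):
    """Product of two Grassmann-algebra elements.

    Elements are dicts mapping a sorted tuple of generator indices to a
    coefficient; the empty tuple () labels the unit 1.
    """
    out = {}
    for ia, ca in a.items():
        for ib, cb in b.items():
            if set(ia) & set(ib):
                continue  # a repeated generator squares to zero
            merged = ia + ib
            # sign of the permutation that sorts the concatenated indices
            inversions = sum(1 for i in range(len(merged))
                             for j in range(i + 1, len(merged))
                             if merged[i] > merged[j])
            key = tuple(sorted(merged))
            out[key] = out.get(key, 0) + (-1) ** inversions * ca * cb
    return {k: v for k, v in out.items() if v != 0}


def parity(a):
    """Return 0 (even), 1 (odd), or None for an inhomogeneous or zero element."""
    degrees = {len(k) % 2 for k in a}
    return degrees.pop() if len(degrees) == 1 else None


b1, b2, b3 = {(1,): 1}, {(2,): 1}, {(3,): 1}
even_element = {(): 2, (1, 2): 3}
print(gmul(b1, b2), gmul(b2, b1), gmul(b1, b1))
```

Odd generators anticommute and square to zero, while an even element such as `even_element` commutes with everything.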

The Grassmann algebra $\mathbb{R}_S$ is used to build the 
(m,n)-dimensional superspace $\mathbb{R}_S^{m,n}$ in the following 
way: 


Definition 2. An (m,n)-dimensional superspace is 
the space 


$$\mathbb{R}_S^{m,n} = \underbrace{\mathbb{R}_{S0} \times \cdots \times \mathbb{R}_{S0}}_{m\ \text{copies}} \times \underbrace{\mathbb{R}_{S1} \times \cdots \times \mathbb{R}_{S1}}_{n\ \text{copies}}$$ [4] 


A typical element of $\mathbb{R}_S^{m,n}$ is written as 
$(x^1, \ldots, x^m; \xi^1, \ldots, \xi^n)$, where the convention is 
used that lower case Latin letters represent even 
objects and lower case Greek letters represent odd 
objects, while small capitals are used for objects of 
mixed or unspecified parity. 


As will be described in more detail below, in the 
geometric approach supermanifolds are spaces 
locally modeled on $\mathbb{R}_S^{m,n}$. In order to define a 
supermanifold, we will need to define a topology 
on this space, and to have some notion of 
differentiation. Consider first multilinear functions 
of purely anticommuting variables. If there are n 
such variables, $\xi^1, \ldots, \xi^n$, then a multilinear function 
F can be expressed in the form 


$$F(\xi^1, \ldots, \xi^n) = F_\emptyset + \sum_{j=1}^{n} F_j\,\xi^j + \sum_{1 \le i < j \le n} F_{ij}\,\xi^i\xi^j + \cdots + F_{1 \ldots n}\,\xi^1 \cdots \xi^n$$ [5] 


where the coefficients $F_\emptyset$, $F_j$ and so on are real 
numbers. Such functions will be known (anticipating 
the terminology for functions of both odd and even 
variables) as supersmooth. (A useful notation will be 
to write 


$$F(\xi^1, \ldots, \xi^n) = \sum_\mu F_\mu\,\xi^\mu$$ [6] 


with $\mu$ a multi-index $\mu = \mu_1 \cdots \mu_k$ and $\xi^\mu = \xi^{\mu_1} \cdots \xi^{\mu_k}$. 
The set of multi-indices is restricted to 
those where $1 \le \mu_1 < \cdots < \mu_k \le n$.) More general 
supersmooth functions, with the coefficients $F_\mu$ 
taking values in $\mathbb{C}$, $\mathbb{R}_S$, or some other algebra are 
also possible. 
Differentiation of supersmooth functions of anticommuting 
variables is defined by linearity together 
with the rule 


$$\frac{\partial}{\partial \xi^j}\left(\xi^{\mu_1} \cdots \xi^{\mu_k}\right) = \begin{cases} (-1)^{l-1}\,\xi^{\mu_1} \cdots \widehat{\xi^{\mu_l}} \cdots \xi^{\mu_k} & \text{if } j = \mu_l \\ 0 & \text{otherwise} \end{cases}$$ [7] 


where the caret ^ indicates an omitted factor. 
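A minimal sketch of this differentiation rule follows. It assumes a function of the odd variables is stored as a dictionary from sorted multi-index tuples to coefficients, and that moving the differentiated factor to the front through l − 1 odd factors produces the sign (−1)^(l−1); the helper name `d_dxi` is hypothetical.

```python
def d_dxi(j, f):
    """Left derivative with respect to xi^j.

    f maps a sorted tuple (mu_1, ..., mu_k) to the coefficient of
    xi^{mu_1} ... xi^{mu_k}.
    """
    out = {}
    for mu, coeff in f.items():
        if j in mu:
            pos = mu.index(j)  # zero-based position, so the sign is (-1)**pos
            rest = mu[:pos] + mu[pos + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** pos * coeff
    return out


f = {(1, 2): 1}  # the monomial xi^1 xi^2
print(d_dxi(1, f), d_dxi(2, f))
print(d_dxi(1, d_dxi(2, f)), d_dxi(2, d_dxi(1, f)))
```

The last line shows that the two odd derivatives anticommute when applied to the same function.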

In order to extend the notion of supersmoothness 
to functions on the more general superspace $\mathbb{R}_S^{m,n}$, 
we should strictly take note of the fact that an even 
Grassmann variable is not simply a real or complex 
variable, as explained in the appendix. Assuming 
this done, a supersmooth function on the general 
superspace $\mathbb{R}_S^{m,n}$ can then be defined as a function of 
the form 


$$F(x^1, \ldots, x^m; \xi^1, \ldots, \xi^n) = \sum_\mu F_\mu(x^1, \ldots, x^m)\,\xi^\mu$$ [8] 


with each coefficient function $F_\mu$ a smooth function 
on $\mathbb{R}^m$. 

The final preparatory idea needed is the topology 
on the superspace $\mathbb{R}_S^{m,n}$. It turns out that a coarse, 
non-Hausdorff topology leads to most of the supermanifolds 
used in physics. In order to define this 
topology, we introduce a mapping 


$$\epsilon : \mathbb{R}_S \to \mathbb{R}$$

defined by 

$$\epsilon\Big(a_0 1 + \sum_i a_i\beta_i + \sum_{i<j} a_{ij}\beta_i\beta_j + \cdots\Big) = a_0$$ [9] 


and the related mapping 

$$\epsilon_{m,n} : \mathbb{R}_S^{m,n} \to \mathbb{R}^m$$

defined by 

$$\epsilon_{m,n}(x^1, \ldots, x^m; \xi^1, \ldots, \xi^n) = (\epsilon(x^1), \ldots, \epsilon(x^m))$$ [10] 

These maps project out all the nilpotent Grassmann 
generators, leaving simply the real part. The 
topology involves the inverse of these projection 
maps: a subset U of $\mathbb{R}_S^{m,n}$ is said to be open if and 
only if there exists an open set V in $\mathbb{R}^m$ such that 
$U = \epsilon_{m,n}^{-1}(V)$. Thus, an open set is unlimited in the 
nilpotent directions. 

In the sequel, where we consider integration, the 
superdeterminant of the matrix M of an endo- 
morphism of a super vector space V will be useful. 
If V is an (m,n)-dimensional super vector space 


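(so that $V_0$ has dimension m and $V_1$ dimension n), 
then M will have the block form 


$$M = \begin{pmatrix} M_{00} & M_{01} \\ M_{10} & M_{11} \end{pmatrix}$$


where the entries of $M_{00}$ and $M_{11}$ are even, whereas 
those of $M_{01}$ and $M_{10}$ are odd. If $N = M^{-1}$ has block 
form 

$$N = \begin{pmatrix} N_{00} & N_{01} \\ N_{10} & N_{11} \end{pmatrix}$$


then the superdeterminant of M is defined by 

$$\operatorname{Sdet} M = \det M_{00}\,\det N_{11}$$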


It can be shown that the superdeterminant obeys the 
product rule, unlike the obvious generalization of 
the determinant to the super case. 
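The product rule can be checked in the smallest nontrivial case with the sketch below. It assumes a (1,1)-dimensional super vector space and a Grassmann algebra with only two generators, stores each entry as a 4-tuple of coefficients, and uses the block-inverse identity N11 = (M11 − M10 M00⁻¹ M01)⁻¹; all function names are hypothetical.

```python
from fractions import Fraction

# An element c*1 + c1*b1 + c2*b2 + c12*b1*b2 of the Grassmann algebra on
# two generators is stored as the tuple (c, c1, c2, c12).


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def mul(a, b):
    a0, a1, a2, a12 = a
    b0, b1, b2, b12 = b
    # b1*b2 = -b2*b1 gives the minus sign; squares of generators vanish
    return (a0 * b0,
            a0 * b1 + a1 * b0,
            a0 * b2 + a2 * b0,
            a0 * b12 + a12 * b0 + a1 * b2 - a2 * b1)


def inv_even(a):
    """Inverse of an even element c + c12*b1*b2 with c != 0."""
    c, _, _, c12 = a
    return (1 / c, 0, 0, -c12 / (c * c))


def matmul(m, n):
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((add(mul(a, e), mul(b, g)), add(mul(a, f), mul(b, h))),
            (add(mul(c, e), mul(d, g)), add(mul(c, f), mul(d, h))))


def sdet(m):
    """Sdet M = det M00 * det N11 for a (1,1) supermatrix."""
    (m00, m01), (m10, m11) = m
    n11 = inv_even(sub(m11, mul(mul(m10, inv_even(m00)), m01)))
    return mul(m00, n11)


def even(c, c12=0):
    return (Fraction(c), Fraction(0), Fraction(0), Fraction(c12))


def odd(c1, c2):
    return (Fraction(0), Fraction(c1), Fraction(c2), Fraction(0))


M1 = ((even(2), odd(1, 0)), (odd(0, 1), even(1)))
M2 = ((even(1), odd(0, 1)), (odd(1, 0), even(3)))
print(sdet(M1), sdet(M2), sdet(matmul(M1, M2)))
```

Exact rational arithmetic is used so that the comparison Sdet(M1 M2) = Sdet(M1) Sdet(M2) is not blurred by rounding.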


The Geometric Approach to 
Supermanifolds 


A manifold is a space locally modeled on the 
topological space $\mathbb{R}^m$, where m is the dimension of 
the manifold. Thus, each point in a manifold has a 
neighborhood which is essentially a neighborhood in 
$\mathbb{R}^m$. The most geometrically intuitive approach to 
supermanifolds is to generalize this directly by 
modeling a space locally on an extension of $\mathbb{R}^m$ to 
include anticommuting variables; the most straightforward 
space with the required algebraic property 
is the superspace $\mathbb{R}_S^{m,n}$ built from a Grassmann 
algebra, leading to a supermanifold of dimension 
(m,n). (The dimension of a supermanifold is a pair 
of integers, indicating the numbers of even and odd 
coordinates of each point.) 

The formal definition of a supermanifold will now 
be given in a manner very closely analogous to that 
of a classical manifold. 


Definition 3. Let M be a set. 


(i) An (m,n) open chart on M is a pair $(U, \phi)$ such 
that U is a subset of M and $\phi$ is an injective map 
of U into $\mathbb{R}_S^{m,n}$, with the image $\phi(U)$ an open set 
in $\mathbb{R}_S^{m,n}$. 

(ii) An (m,n) atlas on M is a collection $\{(U_\alpha, \phi_\alpha)\}$ of 
(m,n) charts on M such that the $U_\alpha$ cover M 
and, whenever $U_\alpha \cap U_\beta$ is not empty, the change 
of coordinate function $\phi_\alpha \circ \phi_\beta^{-1}$ is supersmooth. 


An (m,n)-dimensional supermanifold is a set M 
together with a maximal (m,n) atlas on M. 

The space M is given a topology by defining $U \subset M$ 
to be open if and only if, for each $\alpha$ such that $U \cap U_\alpha$ 
is not empty, the set $\phi_\alpha(U \cap U_\alpha)$ is an open subset 
of $\mathbb{R}_S^{m,n}$. 


Examples of supermanifolds include $\mathbb{R}_S^{m,n}$ itself, and 
also supermanifolds constructed from the data of a 
vector bundle over a classical manifold in a manner 
which will now be described. If N is a classical 
m-dimensional real manifold and E is an n-dimensional 
vector bundle over N, then an (m,n)-dimensional 
supermanifold can be constructed in the following 
way: suppose that $\{(V_\alpha, \psi_\alpha)\}$ is an atlas of charts on N, 
so that each $V_\alpha$ is an open subset of N and each $\psi_\alpha$ is 
an injective map of $V_\alpha$ onto an open subset of $\mathbb{R}^m$, 
with $\psi_\alpha \circ \psi_\beta^{-1}$ smooth. Suppose further that the $V_\alpha$ are 
also local trivialization neighborhoods of the bundle E 
with transition functions $g_{\alpha\beta}: V_\alpha \cap V_\beta \to GL(n)$. 
Then we build the supermanifold M by patching 
together the sets $\epsilon_{m,0}^{-1}(\psi_\alpha(V_\alpha)) \times \mathbb{R}_S^{0,n}$ in a consistent 
way. This leads to a supermanifold with coordinate 
change functions 


$$\phi_\alpha \circ \phi_\beta^{-1}(x_\beta^1, \ldots, x_\beta^m; \xi_\beta^1, \ldots, \xi_\beta^n) = (x_\alpha^1, \ldots, x_\alpha^m; \xi_\alpha^1, \ldots, \xi_\alpha^n)$$


where 


$$(x_\alpha^1, \ldots, x_\alpha^m) = \psi_\alpha \circ \psi_\beta^{-1}(x_\beta^1, \ldots, x_\beta^m), \qquad \xi_\alpha^j = \sum_{k=1}^{n} g_{\alpha\beta}{}^{j}{}_{k}(x_\beta^1, \ldots, x_\beta^m)\,\xi_\beta^k$$ [11] 


(Here again we refer to the appendix for the way in 
which functions of even Grassmann variables, as 
opposed simply to real numbers, are handled.) 
Particular examples of this construction are the 
tangent bundle over N and bundles of spinors over 
N. It was actually shown by Batchelor that all real, 
supersmooth supermanifolds are of this form. 

A similar definition may be made of a complex 
supermanifold using a complex Grassmann algebra, 
with the coordinate transition functions required to 
be superanalytic. In this case, supermanifolds which 
are not related to vector bundles in the manner 
described above are possible, basically because 
partitions of unity do not exist in the analytic 
setting. An example is the twisted supertorus, which 
is built over the standard torus and has transition 
functions $(z, \zeta) \to (z + 1, \zeta)$ and $(z, \zeta) \to (z + a + \alpha\zeta, \zeta + \alpha)$, 
extending the standard torus with transition 
functions $z \to z + 1$, $z \to z + a$. (Here a, $\alpha$ are, 
respectively, even and odd constants.) This supermanifold 
is an example of a super Riemann surface; 
such surfaces play an important role in the quantization 
of the spinning string. 

As with classical manifolds, a natural class of 
functions can be defined on a supermanifold: 
a function f on an open subset U of the 
supermanifold M is said to be supersmooth if, for 
each $\alpha$ such that $U \cap U_\alpha$ is nonempty, the function 
$f \circ \phi_\alpha^{-1}$ is supersmooth on $\phi_\alpha(U \cap U_\alpha)$. In local 
coordinates supersmooth functions are such that 
$f \circ \phi_\alpha^{-1}(x_\alpha^1, \ldots, x_\alpha^m; \xi_\alpha^1, \ldots, \xi_\alpha^n) = \sum_\mu f_{\alpha\mu}(x_\alpha^1, \ldots, x_\alpha^m)\,\xi_\alpha^\mu$, with 
each $f_{\alpha\mu}$ a smooth function. 


The Algebraic Approach to 
Supermanifolds 


In the algebraic approach to supermanifolds, it is the 
algebra of functions, rather than the manifold 
itself, which is extended to include anticommuting 
elements. In this approach an (m,n)-dimensional 
supermanifold is defined to be a pair (N, A), where 
N is an m-dimensional classical manifold and A is a 
sheaf of superalgebras over N with various proper- 
ties, described below. The statement that A is a 
sheaf of algebras over N means that corresponding 
to each open subset U of N there is an algebra A(U); 
also, if $V \subset U$, there is a “restriction map” $\rho_{U,V}$ 
mapping A(U) into A(V), and the various restriction 
maps obey certain consistency conditions. A particular 
example of such a sheaf (with trivial odd part) 
is the sheaf $A_0$ of real-valued functions on N, with 
$A_0(U) = C^\infty(U)$, the set of real-valued smooth functions 
on U, and $\rho_{U,V}$ mapping a function in $C^\infty(U)$ 
to its restriction in $C^\infty(V)$. The defining property of 
the sheaf corresponding to an (m,n)-dimensional 
supermanifold is that there is a cover $\{U_\alpha\}$ of N for 
which the algebras $A(U_\alpha)$ have the form 
$A(U_\alpha) \cong C^\infty(U_\alpha) \otimes \Lambda(\mathbb{R}^n)$, so that a typical element f of 
$A(U_\alpha)$ may be expressed as $f = \sum_\mu f_\mu \xi^\mu$, where 
$f_\mu \in C^\infty(U_\alpha)$ and $\xi^1, \ldots, \xi^n$ are generators of $\Lambda(\mathbb{R}^n)$. The 
notation here is chosen to emphasize the close 
correspondence with the algebra of smooth func- 
tions described at the end of the previous section. 
This makes it clear that, despite an apparent 
difference, the two approaches lead to essentially 
equivalent supermanifolds. 

The advantage of the algebraic approach is its 
mathematical elegance and economy (there is no 
need to introduce the auxiliary Grassmann algebra 
$\mathbb{R}_S$ in which coordinate functions take values), but 
from the point of view of physicists, the geometric 
point of view has two advantages: first, it is closer to 
the standard manifold picture and thus easier to 
grasp, and, second, it allows a wider class of 
supermanifolds, because Grassmann constants are 
allowed; for instance, the twisted supertorus 
described above cannot be included in the algebraic 
approach without either introducing an auxiliary 
algebra or moving to the more difficult concept of a 
family of supermanifolds. 
While there have been various attempts to develop 
infinite-dimensional supermanifolds, most of the 
constructions have been developed for very specific 
purposes, such as path integration and functional 
integration methods for theories with fermions. 
Even the question of defining a basic infinite- 
dimensional superalgebra with the necessary 
analytic properties, such as a Hilbert-Banach super- 
algebra, requires sophisticated procedures, so that 
the development of a theory of infinite-dimensional 
supermanifolds becomes extremely technical. 


Calculus on Supermanifolds 


Much of the calculus of functions on supermanifolds 
proceeds in simple analogy to that of classical 
manifolds, with additional sign factors occurring whenever 
two odd quantities are transposed. For instance, a 
vector field on M may be described as a superderivation 
of the algebra of supersmooth functions 
on M, that is, a linear mapping X of this space obeying 
the super Leibniz rule $X(fg) = (Xf)\,g + (-1)^{|X||f|} f\,(Xg)$. 
Standard examples of vector fields (defined locally) are 
coordinate derivatives $\partial/\partial x^i$ and $\partial/\partial \xi^j$, defined by 
$(\partial/\partial x^i) f = \partial_i(f \circ \phi^{-1})$ and $(\partial/\partial \xi^j) f = \partial_{j+m}(f \circ \phi^{-1})$, with $\phi$ 
the coordinate function corresponding to the coordinates 
$(x^1, \ldots, x^m; \xi^1, \ldots, \xi^n)$. Equipped with this concept 
of vector field, much of differential calculus on 
manifolds can be directly generalized to supermanifolds 
in a relatively straightforward way. However, in 
the case of integration the situation is quite different. 
The standard approach to integration of anticommuting 
variables is the Berezin integral, which is a formal, 
algebraic integral that is not an antiderivative and has 
no measure-theoretic features. There are various 
reasons why such an integral is used: for instance, 
even the simple function $\xi$ of a single anticommuting 
variable has no antiderivative, while the topology on 
$\mathbb{R}_S^{m,n}$ does not allow open sets which discriminate in 
odd directions. Additionally, when changing variables 
on $\mathbb{R}_S^{m,n}$ it is the superdeterminant of the Jacobian 
matrix which must be used. In the purely odd sector, 
differentials thus transform the “wrong” way. 

The Berezin integral of a function f of n anticommuting 
variables is defined by 


$$\int d^n\xi \Big(\sum_\mu f_\mu\,\xi^\mu\Big) = f_{1 \ldots n}$$ [12] 


In other words, Berezin integration simply picks out 
the coefficient of the highest-order term, thus 
resembling differentiation more than integration in 
the classical sense. Nonetheless, the Berezin integral 
has very useful properties, in particular allowing 
direct analogues of Fourier transformations and 
integral kernels. Given that it is the algebra of 
functions, and the operators acting on these alge- 
bras, which is the key element in supergeometry, 
these are vital properties of the integral. 
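Because the Berezin integral only reads off the top coefficient, it takes one line to sketch. The representation below (a dictionary from sorted multi-index tuples to coefficients) and the name `berezin` are assumptions made for this example.

```python
def berezin(f, n):
    """Berezin integral over n odd variables.

    f maps sorted index tuples to coefficients; the integral is the
    coefficient of xi^1 xi^2 ... xi^n, or zero if that term is absent.
    """
    return f.get(tuple(range(1, n + 1)), 0)


# f = 5 + 2 xi^1 - xi^2 + 7 xi^1 xi^2
f = {(): 5, (1,): 2, (2,): -1, (1, 2): 7}
print(berezin(f, 2))

# exp(a xi^1 xi^2) = 1 + a xi^1 xi^2, since (xi^1 xi^2)^2 = 0
a = 3.0
gaussian = {(): 1.0, (1, 2): a}
print(berezin(gaussian, 2))
```

The second example shows the "Gaussian" integral returning a rather than an inverse power of a, which is the same inverted behaviour that appears in the change-of-variable rule.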

The transformation rule under change of variable 
is the inverse of that which one expects. For 
instance, in the case of a single variable, if one 
makes the transformation $\xi \to \phi = a\xi + \beta$ with a and 
$\beta$ constants, a direct calculation shows that the 
integral is invariant provided that one sets $d\xi = a\,d\phi$. 

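Integration on $\mathbb{R}_S^{m,n}$ is essentially defined by 
combining classical integration for the even variables 
with Berezin integration for odd variables, giving 


$$\int_{\epsilon_{m,n}^{-1}(V)} d^m x\, d^n \xi \Big(\sum_\mu f_\mu(x^1, \ldots, x^m)\,\xi^\mu\Big) = \int_V d^m x\, f_{1 \ldots n}(x^1, \ldots, x^m)$$ [13] 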
V 


This also defines integration on supermanifolds, 
provided that we can find a rule for the change of 
variable. This, as indicated above, may be done by 
using the superdeterminant of the Jacobian matrix. 
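Suppose that $(y; \theta)$ are a new set of coordinates on 
our supermanifold. Then an invariant definition of 
integral is obtained if we set 


$$d^m y\, d^n \theta = \operatorname{Sdet}\begin{pmatrix} \partial y/\partial x & \partial y/\partial \xi \\ \partial \theta/\partial x & \partial \theta/\partial \xi \end{pmatrix} d^m x\, d^n \xi$$ [14] 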


Appendix 


We now describe the device which allows functions 
of even Grassmann variables to be handled simply as 
functions of conventional variables. The necessary 
class of functions is captured by defining super- 
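smooth functions on $\mathbb{R}_S^{m,0}$ as extensions by Taylor 
expansion from smooth functions on $\mathbb{R}^m$. 


Definition 4. The function $F: \mathbb{R}_S^{m,0} \to \mathbb{R}_S$ is said to 
be supersmooth if there exists a smooth function 
$\tilde{F}: \mathbb{R}^m \to \mathbb{R}_S$ such that 


$$F(x^1, \ldots, x^m) = \tilde{F}(\epsilon_{m,0}(x)) + \sum_{i=1}^{m} \big(x^i - \epsilon(x^i)1\big)\,\partial_i \tilde{F}(\epsilon_{m,0}(x)) + \frac{1}{2!}\sum_{i,j=1}^{m} \big(x^i - \epsilon(x^i)1\big)\big(x^j - \epsilon(x^j)1\big)\,\partial_i\partial_j \tilde{F}(\epsilon_{m,0}(x)) + \cdots$$ [15] 


(Although this Taylor series will in general be 
infinite, it gives well-defined coefficients for each 
product of generators in the expansion [3], so that the value of F is a 
well-defined element of $\mathbb{R}_S$.) A number of different 
classes of function can be obtained, by varying the 
space in which the function F takes its value. 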


See also: Batalin-Vilkovisky Quantization; BRST 
Quantization; Graded Poisson Algebras; Path-Integrals in 
Non Commutative Geometry; Random Matrix Theory in 
Physics; Supergravity; Superstring Theories; 
Supersymmetric Particle Models; Supersymmetric 
Quantum Mechanics. 
Introduction 


String theory postulates that all elementary particles 
in nature correspond to different vibration states of 
an underlying relativistic string. In the quantum 
theory both the frequencies and the amplitudes of 
vibration are quantized, so that the quantum states 
of a string are discrete. They can be characterized by 
their mass, spin, and various gauge charges. One of 
these states has zero mass and spin equal to 2ħ, and 
can be identified with the messenger of gravitational 
interactions, the graviton. Thus, string theory is a 
candidate for a unified theory of all fundamental 
interactions, including quantum gravity. 

In this article, we discuss the theory of superstrings 
as consistent theories of quantum gravity. The aim is 
to provide a quick (mostly lexicographic and biblio- 
graphic) entry to some of the salient features of the 
subject for a nonspecialist audience. Our treatment is 
thus neither complete nor comprehensive; several 
excellent expert books exist for this purpose, in particular those 
by Green et al. (1987) and by Polchinski (1998). An 
introductory textbook by Zwiebach (2004) is also 
highly recommended for beginners. Several other 
complementary reviews on various aspects of super- 
string theories are available on the internet (see the 
“Further reading” section); some more will be given 
as we proceed. 


The Five Superstring Theories 


Theories of relativistic extended objects are tightly 
constrained by anomalies, that is, quantum viola- 
tions of classical symmetries. These arise because the 
classical trajectory of an extended p-dimensional 
object (or “p-brane”) is described by the embedding 
$X^\mu(\zeta^a)$, where $\zeta^{a=0,\ldots,p}$ parametrize the brane world 
volume, and $X^{\mu=0,\ldots,D-1}$ are coordinates of the 
target space. The quantum mechanics of a single 
p-brane is therefore a (p + 1)-dimensional quantum 
field theory, and as such suffers a priori from 
ultraviolet divergences and anomalies. The case 
p = 1 is special in that these problems can be exactly 
handled. The story for higher values of p is much 
more complicated, as will become apparent later on. 

The theory of ordinary loops in space is called 
closed bosonic string theory. The classical trajectory 
of a bosonic string extremizes the Nambu-Goto 
action (proportional to the invariant area of the 
world sheet) 


$$S_{NG} = -\frac{1}{2\pi\alpha'} \int d^2\zeta\, \sqrt{-\det(G_{\mu\nu}\,\partial_a X^\mu\,\partial_b X^\nu)}$$ [1] 


where $G_{\mu\nu}(X)$ is the target-space metric, and $\alpha'$ is 
the Regge slope (which is inversely proportional to 
the string tension and has dimensions of length 
squared). In flat spacetime, and for a conformal 
choice of world-sheet parameters $\zeta^\pm = \zeta^0 \pm \zeta^1$, the 
equations of motion read: 


$$\partial_+\partial_- X^\mu = 0 \quad\text{and}\quad \eta_{\mu\nu}\,\partial_\pm X^\mu\,\partial_\pm X^\nu = 0$$ [2] 


with $\eta_{\mu\nu}$ the Minkowski metric. The $X^\mu$ are thus free 
two-dimensional fields, subject to quadratic phase-space 
constraints known as the Virasoro conditions. 
These can be solved consistently at the quantum 
level in the critical dimension D = 26. Otherwise, 
the symmetries of eqns [2] are anomalous: either 
Lorentz invariance is broken, or there is a conformal 
anomaly leading to unitarity problems. (For D < 26, 
unitary noncritical string theories in highly curved 
rather than in the originally flat background can be 
constructed.) 

Even for D = 26, bosonic string theory is, however, 
sick because its lowest-lying state is a tachyon, 
that is, it has negative mass squared. This follows 
from the zeroth-order Virasoro constraints, 


$$m^2 = -p^\mu p_\mu = \frac{4}{\alpha'}(N_L - 1) = \frac{4}{\alpha'}(N_R - 1)$$ [3] 


where $N_L$ ($N_R$) is the sum of the frequencies of all 
left(right)-moving excitations on the string world 
sheet. The negative contribution to $m^2$ comes from 
quantum fluctuations, and is analogous to the well-known 
Casimir energy. The tachyon has 
$N_L = N_R = 0$. Its presence signals an instability of 
Minkowski spacetime, which in bosonic string 
theory is expected to decay, possibly to some 
lower-dimensional highly curved geometry. The 
details of how this happens are not, at present, 
well understood. 
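The mass formula can be tabulated for the lowest levels with a short sketch. It assumes the closed bosonic string relation m² = (4/α′)(N − 1) with level matching N_L = N_R = N; the function name and the default α′ = 1 are choices made for this example.

```python
def closed_bosonic_mass_squared(n_left, n_right, alpha_prime=1.0):
    """Mass squared from the zeroth-order Virasoro constraints.

    Assumption: m^2 = (4 / alpha') * (N - 1), valid only when the
    left- and right-moving levels agree.
    """
    if n_left != n_right:
        raise ValueError("level matching requires N_L == N_R")
    return 4.0 * (n_left - 1) / alpha_prime


for level in range(3):
    print(level, closed_bosonic_mass_squared(level, level))
```

Level 0 is the tachyon with negative mass squared, and level 1 is massless, which is where the graviton sits.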

The problem of the tachyon is circumvented by 
endowing the string with additional, anticommuting 
coordinates, and requiring spacetime supersymmetry. 
This is a symmetry that relates string states with 
integer spin, obeying Bose-Einstein statistics, to 
states with half-integer spin obeying Fermi-Dirac 
statistics. There exist two standard descriptions of the 
superstring: the Ramond-Neveu-Schwarz (RNS) 
formulation, where the anticommuting coordinates 
$\psi^\mu$ carry a spacetime vector index, and the Green-Schwarz 
(GS) formulation in which they transform as 
a spacetime spinor $\theta^\alpha$. Each has its advantages and 


drawbacks: the RNS formulation is simpler from the 
world sheet point of view, but awkward for describ- 
ing spacetime fermionic states; in the GS formulation, 
on the other hand, spacetime supersymmetry is 
manifest but quantization can only be carried out in 
the restrictive light-cone gauge. A third formulation, 
possibly combining the advantages of the other two, 
has been proposed more recently by Berkovits (2002) — 
it is still being developed. 

Anomaly cancelation leads to five consistent superstring
theories, all defined in D = 10 flat spacetime
dimensions. They are referred to as type IIA, type IIB,
heterotic SO(32), heterotic $E_8 \times E_8$, and type I. The
two type II theories are given (in the RNS formulation)
by a straightforward extension of eqns [2]:

$$ \partial_+\partial_- X^\mu = \partial_\mp \psi^\mu_\pm = 0 \quad\text{and}\quad \eta_{\mu\nu}\,\psi^\mu_\pm\,\partial_\pm X^\nu = 0 $$   [4]

The left- and right-moving world sheet fermions can 
be separately periodic or antiperiodic — these are 
known as Ramond (R) and Neveu-Schwarz (NS) 
boundary conditions. Ramond fermions have zero
modes obeying a Dirac $\gamma$-matrix algebra, which
must thus be represented on spinor space. As a
result, out of the four possible boundary conditions
for $\psi^\mu_+$ and $\psi^\mu_-$, namely NS-NS, R-R, NS-R, or
R-NS, the first two give rise to string states that are
spacetime bosons, while the other two give rise to
states that are spacetime fermions. Consistency of
the theory further requires that one only keep states 
of definite world-sheet fermion parities — an opera- 
tion known as the Gliozzi-Scherk-Olive (GSO) 
projection. This operation removes the would-be 
tachyon, and acts as a chirality projection on the 
spinors. The type IIA and IIB theories differ only in 
that the spinors coming from the left and right 
Ramond sectors have the opposite chirality in type 
IIA and the same chirality in type IIB. 

The fact that string excitations split naturally into 
noninteracting left and right movers is crucial for 
the construction of the heterotic strings. The key 
idea is to put together the left-moving sector of the 
D = 10 type II superstring and the right-moving
sector of the D = 26 bosonic string. A subtlety arises
because the left-right asymmetry may lead to extra 
anomalies, under global reparametrizations of the 
string world sheet. These are known as modular 
anomalies, and we will come back to them in the 
following section. Their cancelation imposes strin- 
gent constraints on the zero modes of the unmatched 
(chiral) bosons in the right-moving sector. The free- 
field expansion of these bosons can be written as: 


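$$ \mathbf{X}(\zeta^-) = \mathbf{X}_0 + \alpha'\,\mathbf{P}_R\,\zeta^- + i\sqrt{\frac{\alpha'}{2}}\sum_{n\neq 0}\frac{\mathbf{a}_n}{n}\,e^{-in\zeta^-} $$   [5]
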
where bold-face letters denote 16-component vectors.
Modular invariance then requires that the
generalized momentum $\mathbf{P}_R$ take its values in a
sixteen-dimensional, even self-dual lattice. There
exist two such lattices, and they are generated by
the roots of the Lie groups $Spin(32)/Z_2$ and $E_8 \times E_8$.
They give rise to the two consistent heterotic string
theories.
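
The lattice condition can be made concrete for a single $E_8$ factor. An integral lattice is even when every vector has even norm, which holds if the Gram matrix of a basis has even diagonal entries, and it is self-dual when that Gram matrix has unit determinant. For the $E_8$ root lattice the Gram matrix in a basis of simple roots is the Cartan matrix. The sketch below checks both properties; the node labelling of the Dynkin diagram and the helper names are choices made here, not taken from the text.

```python
from fractions import Fraction

# Edges of the E8 Dynkin diagram: a chain of seven nodes (0..6) with an
# eighth node attached to node 2. The labelling is an assumption of this sketch.
E8_EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (2, 7)]

def e8_gram_matrix():
    """Gram matrix of the simple roots (the E8 Cartan matrix)."""
    n = 8
    gram = [[0] * n for _ in range(n)]
    for i in range(n):
        gram[i][i] = 2          # every root has norm 2
    for i, j in E8_EDGES:
        gram[i][j] = gram[j][i] = -1
    return gram

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n = len(rows)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det
        det *= rows[col][col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for k in range(col, n):
                rows[r][k] -= factor * rows[col][k]
    return det

gram = e8_gram_matrix()
is_even = all(gram[i][i] % 2 == 0 for i in range(8))
is_self_dual = determinant(gram) == 1
print(is_even, is_self_dual)
```

Since the Gram matrix of $E_8 \times E_8$ is block diagonal with two such blocks, its determinant is again one and its diagonal is again even, giving a sixteen-dimensional even self-dual lattice.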

In contrast to the type II and heterotic theories, 
which are based on oriented closed strings, the type I 
theory has unoriented closed strings as well as open 
strings in its perturbative spectrum. The closed 
strings are the same as in type IIB, except that one 
only keeps those states that are invariant under 
orientation reversal ($\zeta^+ \leftrightarrow \zeta^-$). Open strings must
also be invariant under this flip, and can further- 
more carry pointlike (Chan-Paton) charges at their 
two endpoints. This is analogous to the flavor 
carried by quarks at the endpoints of the chromo- 
electric flux tubes in QCD. Ultraviolet finiteness 
requires that the Chan-Paton charges span a 
32-dimensional vector space, so that open strings 
transform in bifundamental symmetric or antisymmetric
representations of SO(32). For a thorough
review of type I string theory, see
Angelantonj and Sagnotti (2002, 2003).


Interactions and Effective Theories 


Strings interact by splitting or by joining at a point,
as is illustrated in Figure 1. This is a local
interaction that respects the causality of the theory.
To compute scattering amplitudes, one sums over all
world sheets with a given set of asymptotic states,
and weights each local interaction with a factor of
the string coupling constant $\lambda$. The expansion in
powers of $\lambda$ is analogous to the Feynman-diagram
expansion of point-particle field theories. These
latter are usually defined by a Lagrangian, or more
exactly by a functional-integral measure, and they
make sense both for off-shell quantities and at
the nonperturbative level. In contrast, our current
formulation of superstring theory is in terms of a
perturbatively defined S-matrix. The advent of
dualities has offered glimpses of an underlying
nonperturbative structure called M-theory, but
defining it precisely is one of the major outstanding
problems in the subject. (One approach consists in
trying to define a second-quantized string field
theory; see String Field Theory.)

Figure 1 A four-particle and a four-string interaction.

Another important expansion of string theory,
very useful when it comes to extracting spacetime
properties, is in terms of the characteristic string
length $l_s = \sqrt{\alpha'}$. At energy scales $E\,l_s \ll 1$, only a
handful of massless string states propagate, and their
interactions are governed by an effective low-energy
Lagrangian. In the type II theories, the massless
bosonic states (or rather their corresponding fields)
consist of the metric $G_{\mu\nu}$, a scalar field $\Phi$ called the
dilaton, and a collection of antisymmetric n-form
fields coming from both the NS-NS and the R-R
sectors. For type IIA, these latter are an NS-NS
2-form $B_2$, an R-R 1-form $C_1$, and an R-R 3-form
$C_3$. The leading-order action for these fields reads:


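$$ S_{IIA} = \frac{1}{2\kappa^2}\int d^{10}x \Big[ \sqrt{-G}\,e^{-2\Phi}\Big( R + 4\,\partial_\mu\Phi\,\partial^\mu\Phi - \frac{1}{2}|H_3|^2 \Big)
   - \sqrt{-G}\,\Big( \frac{1}{2}|F_2|^2 + \frac{1}{2}|F_4 - C_1\wedge H_3|^2 \Big) \Big]
   - \frac{1}{4\kappa^2}\int B_2\wedge F_4\wedge F_4 $$   [6]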


where $F_2 = dC_1$, $H_3 = dB_2$, and $F_4 = dC_3$ are field
strengths, the wedge denotes the exterior product of
forms, and $|F_n|^2 = (1/n!)\,F_{\mu_1\ldots\mu_n}F^{\mu_1\ldots\mu_n}$. The dimensionful
coupling $\kappa$ can be expressed in terms of the
string-theory parameters, $2\kappa^2 = (2\pi)^7\lambda^2\alpha'^4$. A similar
expression can be written for the IIB theory, whose
R-R sector contains a 0-form, a 2-form, and a
4-form potential, the latter with self-dual field
strength.

The action [6], together with its fermionic part,
defines the maximally supersymmetric nonchiral
extension of Einstein's gravity in ten dimensions
called type IIA supergravity (see Supergravity and
Salam and Sezgin (1989)). The dilaton and all
antisymmetric tensor fields belong to the supermultiplet
of the graviton; together they provide the
same number of (bosonic) states as a ten-dimensional
nonchiral gravitino. Supersymmetry furthermore
completely fixes all two-derivative terms of the
action, so that the theory defined by [6] is (almost)
unique. (There exists in fact a massive extension of
IIA supergravity, which is the low-energy limit of
string theory with a nonvanishing R-R 10-form field
strength.) It is, therefore, not surprising that it should
emerge as the low-energy limit of the (nonchiral)
superstring theory. The latter provides, however, an
ultraviolet completion of an otherwise nonrenormalizable
theory, a completion which is, at least
perturbatively, finite and consistent.

The finiteness of string perturbation theory has
been, strictly speaking, only established up to two
loops (for a recent review see D'Hoker and Phong
(2002)). However, even though the technical problem
is open and hard, the qualitative case for all-order
finiteness is convincing. It can be illustrated
with the torus diagram which makes a one-loop
contribution to string amplitudes. The thin torus of
Figure 2 could be traced either by a short, light
string propagating (virtually) for a long time, or by a
long, heavy string propagating for a short period of
time. In conventional field theory, these two virtual
trajectories would have made distinct contributions
to the amplitude, one in the infrared and the second
in the ultraviolet region. In string theory, on the
other hand, they are related by a modular transformation
(that exchanges $\zeta^0$ with $\zeta^1$) and must not,
therefore, be counted twice. A similar kind of
argument shows that all potential divergences of
string theory are infrared; they are therefore
kinematical (i.e., occur for special values of the
external momenta), or else they signal an instability
of the vacuum and should cancel if one expands
around a stable ground state.

The low-energy limit of the heterotic and type I 
string theories is N = 1 supergravity plus super
Yang-Mills. In addition to the N=1 graviton 
multiplet, the massless spectrum now also includes 
gauge bosons and their associated gauginos. The 
two-derivative effective action in the heterotic case 
reads: 
Simons gauge 3-form. Again, supersymmetry completely
fixes the above action; the only freedom is in
the choice of the gauge group and of the Yang-Mills
coupling $g_{YM}$. Thus, up to redefinitions of the fields,
the type I theory necessarily has the same low-energy
limit.

Figure 2 The same torus diagram viewed in two different
channels.

The D = 10 supergravity plus super Yang-Mills
has a hexagon diagram that gives rise to gauge and
gravitational anomalies, similar to the triangle
anomaly in D = 4. It turns out that for the two
special groups $E_8 \times E_8$ and SO(32), the structure of
these anomalies is such that they can be canceled by
a combination of local counter-terms. One of them
is of the form $\int B_2 \wedge X_8(F, R)$, where $X_8$ is an 8-form
quartic in the curvature and/or Yang-Mills field
strength. The other is already present in the lower
line of expression [7], with the replacement
$\omega_3^{gauge} \to \omega_3^{gauge} - \omega_3^{Lorentz}$, where the second Chern-Simons
form is built out of the spin connection.
Note that these modifications of the effective action
involve terms with more than two derivatives, and
are not required by supersymmetry at the classical
level. The discovery by Green and Schwarz that
string theory produces precisely these terms (from
integrating out the massive string modes) was called
the "first superstring revolution."


D-Branes 


A large window into the nonperturbative structure 
of string theory has been opened by the discovery of 
D(irichlet)-branes, and of strong/weak-coupling 
duality symmetries. A Dp brane is a solitonic 
p-dimensional excitation, defined indirectly by the 
property that open string endpoints can attach to its 
world volume (see Figure 3). Stable Dp branes exist 
in the type IIA and type IIB theories for p even and
p odd, respectively, and in the type I theory for p = 1
and 5. They are charged under the R-R (p + 1)-form
potential or, for p > 4, under its magnetic dual.
Strictly speaking, only for $0 \le p \le 6$ do D-branes
resemble regular solitons (the word stands for
"solitary waves"). The D7 branes are more like
cosmic strings, the D8 branes are domain walls,
while the D9 branes are spacetime filling. Indeed,
type I string theory can be thought of as arising from
type IIB through the introduction of an orientifold
9-plane (required for tadpole cancelation) and of 32
D9 branes.

Figure 3 D-branes and open strings.

The low-energy dynamics of a Dp brane is
described by a supersymmetric abelian gauge theory,
reduced from ten down to p + 1 dimensions. The
gauge field multiplet includes 9 - p real scalars,
plus gauginos in the spinor representation of the
R-symmetry group SO(9 - p). These are precisely
the massless states of an open string with endpoints
moving freely on a hyperplane. The real scalar fields
are Goldstone modes of the broken translation
invariance, that is, they are the transverse coordinate
fields $Y(\zeta^a)$ of the D-brane. The bosonic part of the
low-energy effective action is the sum of a
Dirac-Born-Infeld (DBI) and a Chern-Simons (CS) like
term:


$$ I_{Dp} = -T_p \int d^{p+1}\zeta\; e^{-\Phi}\sqrt{-\det(\hat G_{ab} + \mathcal{F}_{ab})}
   \;-\; \rho_p \int \sum_n \hat C_n \wedge e^{\mathcal{F}} $$   [8]

where $\mathcal{F}_{ab} = \hat B_{ab} + 2\pi\alpha' F_{ab}$, hats denote pullbacks
on the brane of bulk tensor fields (e.g., $\hat G_{ab} =
G_{\mu\nu}\,\partial_a Y^\mu\,\partial_b Y^\nu$), $F_{ab}$ is the field strength of the
world-volume gauge field, and in the CS term
one is instructed to keep the (p + 1)-form of the
expression under the integration sign. The constants
$T_p$ and $\rho_p$ are the tension and charge density of the
D-brane. As was the case for the effective supergravities,
the above action receives curvature
corrections that are higher order in the $\alpha'$ expansion.
Note however that a class of higher-order
terms have been already resummed in expression
[8]. These involve arbitrary powers of $F_{ab}$, and are
closely related (more precisely T-dual, see later) to
relativistic effects which can be important even in
the weak-acceleration limit. When referring to the
D9 branes of the type I superstring, the action [8]
includes the GS terms required to cancel the gauge
anomaly.

The tension and charge density of a Dp brane can
be extracted from its coupling to the (closed-string)
graviton and R-R (p + 1)-form, with the result:

$$ T_p^2 = \rho_p^2 = \frac{\pi}{\kappa^2}\,(4\pi^2\alpha')^{3-p} $$   [9]

The equality of tension and charge follows from
unbroken supersymmetry, and is also known as a
Bogomol'nyi-Prasad-Sommerfeld (BPS) condition.
It implies that two or more identical D-branes
exert no net static force on each other, because
their R-R repulsion cancels exactly their gravitational
attraction. A nontrivial check of the result
[9] comes from the Dirac quantization condition
(generalized to extended objects by Nepomechie
and Teitelboim). Indeed, a Dp brane and a
D(6 - p)-brane are dual excitations, like electric
and magnetic charges in four dimensions, so their
couplings must obey


$$ 2\kappa^2\,\rho_p\,\rho_{6-p} = 2\pi k \quad\text{where}\quad k \in \mathbb{Z} $$   [10]


This ensures that the Dirac singularity of the long- 
range R-R fields of the branes does not lead to an 
observable Bohm-Aharonov phase. The couplings 
[9] obey this condition with k = 1, so that D-branes
carry the smallest allowed R-R charges in the 
theory. 
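
That the D-brane couplings saturate the quantization condition with k = 1 can be checked numerically. The sketch below assumes the BPS values $T_p^2 = \rho_p^2 = (\pi/\kappa^2)(4\pi^2\alpha')^{3-p}$ together with $2\kappa^2 = (2\pi)^7\lambda^2\alpha'^4$; the numerical values of the coupling and of $\alpha'$ are arbitrary, and the function names are illustrative.

```python
import math

def kappa_squared(lam, alpha_prime):
    """kappa^2 from the assumed relation 2 kappa^2 = (2 pi)^7 lam^2 alpha'^4."""
    return 0.5 * (2 * math.pi) ** 7 * lam ** 2 * alpha_prime ** 4

def rho(p, lam, alpha_prime):
    """Charge density of a Dp brane, assuming rho_p^2 = (pi/kappa^2)(4 pi^2 alpha')^(3-p)."""
    k2 = kappa_squared(lam, alpha_prime)
    return math.sqrt(math.pi / k2) * (4 * math.pi ** 2 * alpha_prime) ** ((3 - p) / 2)

def dirac_integer(p, lam=0.3, alpha_prime=1.7):
    """Return k defined by 2 kappa^2 rho_p rho_(6-p) = 2 pi k."""
    k2 = kappa_squared(lam, alpha_prime)
    return 2 * k2 * rho(p, lam, alpha_prime) * rho(6 - p, lam, alpha_prime) / (2 * math.pi)

for p in range(7):
    print(p, dirac_integer(p))
```

The powers of $\alpha'$ cancel between the brane and its dual, and the dependence on the coupling drops out through $\kappa$, so the result does not depend on the arbitrary inputs.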

A simple but important observation is that open
strings living on a collection of n identical D-branes
have matrix-valued wave functions $\psi^{ij}$, where
i, j = 1, ..., n label the possible endpoints of the
string. The low-energy dynamics of the branes is
thus described by a nonabelian gauge theory, with
group U(n) if the open strings are oriented, and
SO(n) or Sp(n) if they are not. We have already
encountered such Chan-Paton factors in our discussion
of the type I superstring. More generally, this
simple property of D-branes has led to many insights
on the geometric interpretation and engineering of
gauge theories, which are reviewed in the articles
Brane Construction of Gauge Theories and Gauge
Theories from Strings. It has also placed on a firmer
footing the idea of a brane world, according to
which the fields and interactions of the standard
model would be confined to a set of D-branes, while
gravitons are free to propagate in the bulk (for
reviews, see Brane Worlds and Lüst (2004)).
It has, finally, inspired the gauge/string
theory or AdS/CFT correspondence (see AdS/CFT
Correspondence and Aharony et al. (2000)) on
which we will comment later.


Dualities and M Theory 


One other key role of D-branes has been to provide 
evidence for the various nonperturbative duality 
conjectures. Dual descriptions of the same physics 
arise also in conventional field theory. A prime 
example is the Montonen-Olive duality of four- 
dimensional, N=4 supersymmetric Yang-Mills, 
which is the low-energy theory describing the 
dynamics of a collection of D3 branes. The action 
for the gauge field and six associated scalars $\Phi^I$ (all in
the adjoint representation of the gauge group G) is

Consider for simplicity the case G = SU(2). The
scalar potential has flat directions along which the
six $\Phi^I$ commute. By an SO(6) R-symmetry rotation,
we can set all but one of them to zero, and let
$\langle \mathrm{tr}(\Phi^1\Phi^1)\rangle = v^2$ in the vacuum. In this "Coulomb
phase" of the theory, a U(1) gauge multiplet stays
massless, while the charged states become massive
by the Higgs effect. The theory admits furthermore
smooth magnetic-monopole and dyon solutions, and
there is an elegant formula for their mass:

$$ M = v\,|n_{el} + \tau\,n_{mg}|, \quad\text{where}\quad \tau = \frac{\theta}{2\pi} + \frac{4\pi i}{g^2} $$   [12]

and $n_{el}$ ($n_{mg}$) denotes the quantized electric (magnetic)
charge. This is a BPS formula that receives
no quantum corrections. It exhibits the SL(2, Z)
covariance of the theory,

Here a, b, c, d are integers subject to the condition
ad - bc = 1. Of special importance is the transformation
$\tau \to -1/\tau$, which exchanges electric and
magnetic charges and (at least for $\theta = 0$) the strong-
with the weak-coupling regimes. For more details
see the review by Harvey (1996).

The extension of these ideas to string theory can be
illustrated with the strong/weak-coupling duality
between the type I theory and the $Spin(32)/Z_2$
heterotic string. Both have the same massless spectrum
and low-energy action, whose form is dictated
entirely by supersymmetry. The only difference lies in
the relations between the string and supergravity
parameters. Eliminating the latter, one finds

$$ \lambda_{het} = \frac{1}{2\lambda_I} \quad\text{and}\quad \alpha'_{het} \simeq \sqrt{2}\,\lambda_I\,\alpha'_I $$   [14]

It is thus tempting to conjecture that the strongly
coupled type I theory has a dual description as a
weakly coupled heterotic string. These are, indeed,
the only known ultraviolet completions of the
theory [7]. Furthermore, for $\lambda_I \gg 1$, the D1 brane
of the type I theory becomes light, and could be
plausibly identified with the heterotic string. This
conjecture has been tested successfully by comparing
various supersymmetry-protected quantities (such as
the tensions of BPS excitations and special higher-derivative
terms in the effective action), which can be
calculated exactly either semiclassically, or at a given
order in the perturbative expansion. Testing the duality
for nonprotected quantities is a hard and important
problem, which currently looks out of reach.
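
The statement that the two theories share the same supergravity parameters can be illustrated with a short numerical sketch. It assumes the map $\lambda_{het} = 1/(2\lambda_I)$, $\alpha'_{het} = \sqrt{2}\lambda_I\alpha'_I$, the relation $2\kappa^2 = (2\pi)^7\lambda^2\alpha'^4$ in both theories, and the textbook normalizations $g_{YM}^2 = 2(2\pi)^{7/2}\alpha'\kappa$ for type I and $g_{YM}^2 = 4\kappa^2/\alpha'$ for the heterotic string. These normalizations are assumptions brought in for the check, not formulas quoted in the text above.

```python
import math

TWO_PI = 2 * math.pi

def kappa(lam, alpha_prime):
    """Gravitational coupling, assuming 2 kappa^2 = (2 pi)^7 lam^2 alpha'^4."""
    return math.sqrt(0.5 * TWO_PI ** 7) * lam * alpha_prime ** 2

def g_ym_squared_type1(lam, alpha_prime):
    """Assumed type I normalization: g_YM^2 = 2 (2 pi)^(7/2) alpha' kappa."""
    return 2 * TWO_PI ** 3.5 * alpha_prime * kappa(lam, alpha_prime)

def g_ym_squared_heterotic(lam, alpha_prime):
    """Assumed heterotic normalization: g_YM^2 = 4 kappa^2 / alpha'."""
    return 4 * kappa(lam, alpha_prime) ** 2 / alpha_prime

def heterotic_parameters(lam_1, alpha_1):
    """Map type I parameters to heterotic ones under the assumed duality relations."""
    return 1.0 / (2.0 * lam_1), math.sqrt(2.0) * lam_1 * alpha_1

for lam_1 in (0.1, 1.0, 10.0):
    lam_h, alpha_h = heterotic_parameters(lam_1, 1.0)
    same_kappa = math.isclose(kappa(lam_1, 1.0), kappa(lam_h, alpha_h))
    same_gauge = math.isclose(g_ym_squared_type1(lam_1, 1.0),
                              g_ym_squared_heterotic(lam_h, alpha_h))
    print(lam_1, lam_h, same_kappa, same_gauge)
```

Under these assumptions both supergravity couplings agree for every value of the type I coupling, while the heterotic coupling becomes small exactly when the type I coupling is large.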

The other three string theories also have well-motivated
dual descriptions at strong coupling $\lambda$.
The type IIB theory is believed to have an SL(2, Z)
symmetry, similar to that of the N = 4 super Yang-Mills.
(Note that $\lambda$ is a dynamical parameter that
changes with the vacuum expectation value of the
dilaton $\langle\Phi\rangle$. Thus, dualities are discrete gauge
symmetries of string theory.) The type IIA theory
has a more surprising strong-coupling limit: it grows
one extra dimension (of radius $R_{11} = \lambda\sqrt{\alpha'}$), and
can be approximated at low energy by the maximal
11-dimensional supergravity of Cremmer, Julia, and
Scherk. The latter is a very economical theory: its
massless bosonic fields are only the graviton and a
3-form potential $A_3$. The bosonic part of the action
reads

The electric and magnetic charges of the 3-form are a
(fundamental?) membrane and a solitonic 5-brane.
Standard Kaluza-Klein reduction on a circle maps $S_{11D}$
to the IIA supergravity action [6], where $G_{\mu\nu}$, $\Phi$, and $C_1$
descend from the 11-dimensional graviton, and $B_2$ and
$C_3$ from the 3-form $A_3$. Furthermore, all BPS excitations
of the type IIA string theory have a counterpart in
11 dimensions, as summarized in Table 1. Finally, if
one compactifies the eleventh dimension on an interval
(rather than a circle), one finds the conjectured
strong-coupling limit of the $E_8 \times E_8$ heterotic string.

The web of duality relations can be extended by
compactifying further to $D \le 9$ dimensions. Readers
interested in more details should consult Polchinski
(1998) or one of the many existing reviews of the
subject (Townsend (1996), see also "Further Reading"
section). In nine dimensions, in particular, the
two type II theories, as well as the two heterotic
superstrings, are pairwise T-dual. T-duality is a
perturbative symmetry (thus firmly established, not
only conjectured) which exchanges momentum and
winding modes. Putting together all the links one
arrives at the fully connected web of Figure 4. This
makes the point that all five consistent superstrings,
and also 11-dimensional supergravity, are limits of a
unique underlying structure called M theory. (For
lack of a better definition, "M" is sometimes also
used to denote the D = 11 supergravity plus
supermembranes, as in Figure 4.) A background-independent
definition of M theory has remained
elusive. Attempts to define it as a matrix model of
D0 branes, or by quantizing a fundamental membrane,
proved interesting but incomplete. A difficulty
stems from the fact that in a generic
background, or in D = 11 Minkowski spacetime,
there is only a dimensionful parameter fixing the
scale at which the theory becomes strongly coupled.

Table 1 BPS excitations of type IIA string theory, and their counterparts in M theory compactified on a circle of radius $R_{11}$

Tension (type IIA)                                       Type IIA      M on $S^1$          Tension (M theory)
$(\sqrt{\pi}/\kappa_{10})(2\pi\sqrt{\alpha'})^{3}$       D0 brane      K-K excitation      $1/R_{11}$
$T_F = (2\pi\alpha')^{-1}$                               String        Wrapped membrane    $2\pi R_{11}\,(2\pi^2/\kappa_{11}^2)^{1/3}$
$(\sqrt{\pi}/\kappa_{10})(2\pi\sqrt{\alpha'})$           D2 brane      Membrane            $T_2^M = (2\pi^2/\kappa_{11}^2)^{1/3}$
$(\sqrt{\pi}/\kappa_{10})(2\pi\sqrt{\alpha'})^{-1}$      D4 brane      Wrapped 5-brane     $R_{11}\,(2\pi^2/\kappa_{11}^2)^{2/3}$
$(\pi/\kappa_{10}^2)(2\pi\alpha')$                       NS-5-brane    5-brane             $T_5^M = (1/2\pi)(2\pi^2/\kappa_{11}^2)^{2/3}$
$(\sqrt{\pi}/\kappa_{10})(2\pi\sqrt{\alpha'})^{-3}$      D6 brane      K-K monopole        $2\pi^2 R_{11}^2/\kappa_{11}^2$

From Bachas CP (1997) Lectures on D-branes. In: Olive DI and West PC (eds.) Duality and Supersymmetric Theories, Proceedings,
Easter School, Newton Institute, Euroconference, Cambridge, UK, April 7-18. With permission of Cambridge University Press.

Figure 4 Web of dualities in nine dimensions. From Bachas CP
(1997) Lectures on D-branes. In: Olive DI and West PC (eds.)
Duality and Supersymmetric Theories, Proceedings, Easter School,
Newton Institute, Euroconference, Cambridge, UK, April 7-18. With
permission of Cambridge University Press.
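
The matching of type IIA and M-theory tensions listed in Table 1 can be checked numerically. The sketch below assumes $2\kappa_{10}^2 = (2\pi)^7\lambda^2\alpha'^4$, the radius $R_{11} = \lambda\sqrt{\alpha'}$, and the Kaluza-Klein relation $\kappa_{11}^2 = 2\pi R_{11}\kappa_{10}^2$; the last relation is a standard assumption that is not spelled out in the text. Row labels and function names are illustrative.

```python
import math

def table_one(lam, alpha_prime):
    """Return {row: (type IIA tension, M-theory tension)} for the rows of Table 1."""
    two_pi = 2 * math.pi
    kappa10_sq = 0.5 * two_pi ** 7 * lam ** 2 * alpha_prime ** 4  # assumed relation
    r11 = lam * math.sqrt(alpha_prime)                            # assumed radius
    kappa11_sq = two_pi * r11 * kappa10_sq                        # assumed KK relation

    def d_brane(p):
        # (sqrt(pi)/kappa_10) (2 pi sqrt(alpha'))^(3 - p)
        return math.sqrt(math.pi / kappa10_sq) * (two_pi * math.sqrt(alpha_prime)) ** (3 - p)

    t2 = (2 * math.pi ** 2 / kappa11_sq) ** (1 / 3)   # membrane tension
    t5 = t2 ** 2 / two_pi                             # 5-brane tension
    return {
        "D0 / K-K excitation": (d_brane(0), 1 / r11),
        "string / wrapped membrane": (1 / (two_pi * alpha_prime), two_pi * r11 * t2),
        "D2 / membrane": (d_brane(2), t2),
        "D4 / wrapped 5-brane": (d_brane(4), two_pi * r11 * t5),
        "NS5 / 5-brane": (math.pi / kappa10_sq * two_pi * alpha_prime, t5),
        "D6 / K-K monopole": (d_brane(6), 2 * math.pi ** 2 * r11 ** 2 / kappa11_sq),
    }

for row, (iia, m_theory) in table_one(0.7, 1.3).items():
    print(row, iia, m_theory)
```

Each pair agrees for any choice of coupling and $\alpha'$, which is the content of the correspondence between the two columns of the table.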


Other Developments and Outlook 


We have not discussed in this brief review some
important developments covered in other contributions
to the encyclopedia. For the reader's convenience,
and for completeness, we enumerate (some
of) them, giving the appropriate cross-references:

Compactification. To make contact with the
standard model of particle physics, one has to
compactify string theory on a six-dimensional
manifold. There is an embarrassment of riches,
but no completely realistic vacuum and, more
significantly, no guiding dynamical principle to
help us decide (see Compactification of Superstring
Theory). The controlled (and phenomenologically
required) breaking of spacetime supersymmetry is
also a problem.

Conformal field theory and quantum geometry. 
The algebraic tools of 2D conformal field theory, 
both bulk and boundary (see Two-Dimensional 
Conformal Field Theory and Vertex Operator 
Algebras), play an important role in string theory. 
They allow, in certain cases, a resummation of $\alpha'$
effects, thereby probing the regime where classical 
geometric notions do not apply. 

Microscopic models of black holes. Charged extre- 
mal black holes can be modeled in string theory by BPS 
configurations of D-branes. This has led to the first 
microscopic derivation of the Bekenstein-Hawking 
entropy formula, a result expected from any consistent 
theory of quantum gravity. As with the tests of duality, 
the extension of these results to neutral black holes is a 
difficult open problem — see Branes and Black Hole 
Statistical Mechanics. 

AdS/CFT and holography. A new type of (holo- 
graphic) duality is the one that relates supersym- 
metric gauge theories in four dimensions to string 
theory in asymptotically anti-de Sitter spacetimes. 
The sharpest and best-tested version of this duality 
relates N = 4 super Yang-Mills to string theory in
$AdS_5 \times S^5$. Solving the $\sigma$-model in this latter
background is one of the keys to further progress in the
subject (see AdS/CFT Correspondence).

String phenomenology. Finding an experimental
confirmation of string theory is clearly one of the most
pressing outstanding questions. There exist several
interesting possibilities for this: cosmic strings, large
extra dimensions, modifications of gravity, primordial
cosmology (see String Theory: Phenomenology for a
review). Here we point out the one supporting piece
of experimental evidence: the unification of the
gauge couplings of the (supersymmetric, minimal)
standard model at a scale close to, but below, the
Planck scale, as illustrated in Figure 5. This is a
generic "prediction" of string theory, especially in its
heterotic version.

Figure 5 The unification of couplings.


See also: AdS/CFT Correspondence; Boundary 
Conformal Field Theory; Brane Construction of Gauge 
Theories; Brane Worlds; Branes and Black Hole 
Statistical Mechanics; Compactification of Superstring 
Theory; Derived Categories; Electroweak Theory; 
Gauge Theories from Strings; Noncommutative 
Geometry from Strings; Supermanifolds; String Field 
Theory; String Theory: Phenomenology; Supergravity; 
Two-Dimensional Conformal Field Theory and Vertex 
Operator Algebras; Wheeler-DeWitt Theory.
Introduction


Supersymmetric quantum field theories (see Supergravity)
are characterized by the existence of one
(N = 1 supersymmetry) or several (N > 1 extended
supersymmetry) conserved Noether-like charges
$Q_A$, A = 1, ..., N, which establish symmetry links
between particle states of different spin. Supersymmetry
ensures equal numbers of bosonic and
fermionic particle states. If it is exact, bosons and
fermions related by supersymmetry transformations
have equal masses. Moreover, supersymmetry
imposes stringent relations between interactions
which involve particles of different spin. This gives
rise to a special ultraviolet behavior of supersymmetric
theories. Their ultraviolet divergences are
much softer than in nonsupersymmetric theories. In
particular, N = 4 supersymmetric quantum field
theories are finite, and for any N they are free from
the quadratic divergences plaguing ordinary theories
with elementary scalars. N > 4 supersymmetric
theories necessarily involve particles of spin higher
than 1 and are not renormalizable. Supersymmetry
promoted to a local symmetry includes gravity.

Only N = 1 supersymmetric theories allow for
chiral fermions, which are the fundamental objects in
elementary particle interactions (see Standard Model
of Particle Physics). This is because parity and
charge conjugation symmetries are violated in weak
interactions. Therefore, N > 1 theories may not be
of immediate phenomenological relevance. However,
they may be useful for constructing supersymmetric
theories in more than four dimensions
(more than three spatial dimensions). A chiral (effective)
theory in four dimensions can then be obtained
after compactification of the extra dimensions. For
instance, an N = 2 theory in five dimensions (x, y)
compactified on a circle with reflection symmetry
$y \to -y$ (orbifold compactification) gives a chiral
N = 1 theory in four dimensions.

Absence of quadratic divergences in supersymmetric
theories is the main argument supporting the
belief that fundamental interactions of elementary
particles at energies not higher than O(1 TeV) should
be described by an (approximately) N = 1 supersymmetric
extension of the standard model (SM).
Indeed, supersymmetric models elegantly solve the
so-called hierarchy problem of the SM. At present,
supersymmetry remains a theoretical hypothesis.
No experimental evidence for it has been found yet
(for experimental lower bounds on the masses of
supersymmetric particles see Eidelman et al. (2004)).
Supersymmetric models will be tested experimentally
at the Large Hadron Collider at CERN (Geneva), after
the completion of its construction in 2007. Supergravity
theories may be physically relevant as an
intermediate step in constructing phenomenologically
viable models from superstring theories.

The essence of the hierarchy problem of the
standard model (SM), the successful $SU(3)_c \times
SU(2)_L \times U(1)_Y$ gauge theory of interactions of
quarks and leptons at energies up to about 100 GeV,
is the following. By itself, the SM does not explain the
value of the Fermi scale v of the electroweak
$SU(2)_L \times U(1)_Y$ symmetry breaking ($v \sim G_\mu^{-1/2}$, where
$G_\mu$ is the Fermi constant determined by the lifetime
of the muon). Indeed, in the SM, the electroweak
symmetry breaking is realized by an elementary Higgs
field H (an SU(2) doublet) with a potential


$$ V = m^2\,H^\dagger H + \frac{\lambda}{2}\,(H^\dagger H)^2 $$   [1]


where m and $\lambda$ are free parameters of the SM. When
$m^2 < 0$ is chosen, the minimum of the potential
occurs when

$$ \langle H^\dagger H \rangle = -\frac{m^2}{\lambda} \equiv \frac{v^2}{2} $$   [2]

that is, the Higgs doublet acquires an $SU(2)_L \times U(1)_Y$
breaking vacuum expectation value v, which is just
the Fermi scale. The masses of the intermediate
vector bosons $W^\pm$ and $Z^0$ are proportional to v and
depend also on the gauge couplings. Within the SM
understood as a theory with the momentum cut-off
$\Lambda_{SM}$, quantum corrections to the mass parameter $m^2$
in eqn [1] are quadratically divergent:

$$ \delta m^2 = \frac{3}{64\pi^2}\,(3g_2^2 + g_1^2 + \lambda - 8y_t^2)\,\Lambda_{SM}^2 $$   [3]


Here, $g_1$, $g_2$, and $y_t$ are the gauge couplings of the
groups $U(1)_Y$ and $SU(2)_L$, and the top-quark Yukawa
coupling, respectively. This means that if, above the
energy scale $\Lambda_{SM}$, the SM is replaced by some more
fundamental theory, in which there are particles of
masses $M > \Lambda_{SM}$, the quantum corrections to $m^2$ are
quadratically dependent on the new mass scale M.
For $M \gg v$, this is very unnatural even if the original
parameter $m^2$ remains a free parameter of this
underlying theory, and particularly difficult to accept
if in the underlying theory $m^2$ is fixed by some more
fundamental considerations. If the SM was the
correct theory up to, for example, the mass scale
suggested by the see-saw mechanism for the neutrino
masses, $\Lambda_{SM} \sim 10^{15}$ GeV, then


$$ |\delta m^2| \sim 10^{28}\ \mathrm{GeV}^2 \sim 10^{24}\,v^2 $$


Clearly, this excludes the possibility of understanding
the magnitude of the Fermi scale v in any
sensible way. Thus, for naturalness of the Higgs
mechanism in the SM there should exist a new mass
scale $M \gtrsim v$, say only one order of magnitude higher
than v, and the theory describing the physics above
that scale should be free of quadratic divergences.
(Approximate) supersymmetry is at present the most
elegant and theoretically most complete solution to
the hierarchy problem of the SM.
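
The size of the quadratic correction can be estimated with a short numerical sketch. It assumes the one-loop form $\delta m^2 = (3/64\pi^2)(3g_2^2 + g_1^2 + \lambda - 8y_t^2)\Lambda_{SM}^2$ and rough illustrative values for the couplings and for v (about 246 GeV); none of these numbers are taken from the text, and only the order of magnitude is meaningful.

```python
import math

def delta_m_squared(cutoff_gev, g1=0.36, g2=0.65, y_t=1.0, lam=0.26):
    """One-loop quadratic correction in GeV^2 for a given cut-off.

    The coupling values are rough, assumed inputs used for illustration only.
    """
    prefactor = 3.0 / (64.0 * math.pi ** 2)
    couplings = 3 * g2 ** 2 + g1 ** 2 + lam - 8 * y_t ** 2
    return prefactor * couplings * cutoff_gev ** 2

V_FERMI_GEV = 246.0          # assumed value of the Fermi scale v
correction = delta_m_squared(1e15)

print("log10 |delta m^2| in GeV^2:", math.log10(abs(correction)))
print("log10 |delta m^2| / v^2  :", math.log10(abs(correction) / V_FERMI_GEV ** 2))
```

With these inputs the top-quark term dominates, so the correction is negative, and its magnitude exceeds $v^2$ by more than twenty orders of magnitude, which is the fine-tuning described above.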


Supersymmetric Extensions of the SM 


In supersymmetry, the gauge fields $A^a_\mu$ are promoted
to vector superfields $V^a = (A^a_\mu, \lambda^a, D^a)$, one for each
gauge symmetry group generator, where the $\lambda^a$ are
Weyl fermions (called gauginos) and the $D^a$ are
nondynamical auxiliary fields. A renormalizable
supersymmetric gauge theory is completely defined
(see, e.g., Sohnius (1985) and Wess and Bagger
(1992)) by specifying the gauge group, the set of
chiral supermultiplets $\Phi_i = (\phi_i, \psi_i, F_i)$ representing
matter fields, and the superpotential, a holomorphic
polynomial function of at most third
order in the chiral superfields which determines
Yukawa couplings of the fermions $\psi_i$ and scalars $\phi_i$.
Auxiliary fields $D^a$ and $F_i$ can be eliminated via their
(algebraic) equations of motion.

The so-called minimal supersymmetric SM 
(MSSM) encodes the main features of any super- 
symmetric extension of the SM. Its gauge group is 
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SU(3) x SU(2) x U(1) - the same as in the SM — and 
the chiral superfields are associated to each of the 
SM quark and lepton fields. Thus, quarks and 
leptons get scalar spin zero superpartners, the 
squarks and sleptons, carrying the same quantum 
numbers as their corresponding fermions and the 
vector superfields provide spin 1/2 superpartners for
the gauge fields: the gluinos, the winos, and the
bino. The SM Higgs doublet with weak hypercharge
$Y = 1/2$ becomes a scalar component of a chiral
superfield $H_u$, which contains in addition one
doublet of Weyl fermions, the Higgsinos. The
chiral anomaly cancelation condition requires that
there be also a second Higgs chiral superfield $H_d$
with $Y = -1/2$. Such a superfield is also required for
giving masses to all flavors of quarks; because of the
holomorphicity of the superpotential the same Higgs
doublet cannot couple simultaneously to all quarks.

With the MSSM superfield content, the most 
general renormalizable superpotential consistent 
with the gauge symmetry has the form 


$$W = Y_u U^c Q H_u + Y_d D^c Q H_d + Y_e E^c L H_d + \mu H_u H_d$$
$$\quad + \lambda_1 D^c Q L + \lambda_2 E^c L L + \lambda_3 U^c D^c D^c + \lambda_4 L H_u \qquad [4]$$


(flavor indices are suppressed), where the superfield Q
contains the SU(2) quark doublet $q$ and its scalar
superpartner $\tilde q$, and similarly for the lepton doublet
L, quark singlet U, D, and lepton singlet E superfields.
The first three terms in [4] give the SM-like
Yukawa couplings of quarks and leptons to the Higgs 
fields together with Yukawa couplings of the corre- 
sponding superpartners. The fourth term has no SM 
analogy; it gives supersymmetric masses to the Higgs 
scalar and Higgsinos. The interactions in the second 
line do not conserve baryon and lepton numbers, 
respectively B and L, and should be forbidden (or 
strongly suppressed) by some additional symmetry of 
the theory as they would lead to rapid proton decay. A 
discrete symmetry, called R-parity, $R = (-1)^{2S+3B+L}$,
where S is the spin of the field, is an interesting 
possibility. R-parity acts differently on the different 
components of the superfields: it is even for all SM 
particles and odd for their superpartners. Its conserva- 
tion implies that superpartners must appear in pairs in 
any interaction vertex. Thus, with R-parity imposed, 
the lightest supersymmetric particle is stable and it is an 
excellent candidate for the dark matter in the universe. 
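
The statement that R-parity is even for SM particles and odd for their superpartners can be checked mechanically. The sketch below assumes the standard form $R = (-1)^{2S+3B+L}$; the particle table is a small illustrative selection, and the function name `r_parity` is hypothetical.

```python
from fractions import Fraction


def r_parity(spin, baryon_number, lepton_number):
    """Return R = (-1)^(2S + 3B + L) as +1 or -1.

    Assumes the standard definition of R-parity; the exponent must be
    an integer for physical states.
    """
    exponent = (2 * Fraction(spin)
                + 3 * Fraction(baryon_number)
                + Fraction(lepton_number))
    if exponent.denominator != 1:
        raise ValueError("2S + 3B + L must be an integer")
    return 1 if exponent % 2 == 0 else -1


# (spin, B, L) for a few SM particles and their superpartners.
PARTICLES = {
    "quark": (Fraction(1, 2), Fraction(1, 3), 0),
    "squark": (0, Fraction(1, 3), 0),
    "electron": (Fraction(1, 2), 0, 1),
    "selectron": (0, 0, 1),
    "gluon": (1, 0, 0),
    "gluino": (Fraction(1, 2), 0, 0),
    "Higgs scalar": (0, 0, 0),
    "Higgsino": (Fraction(1, 2), 0, 0),
}

if __name__ == "__main__":
    for name, quantum_numbers in PARTICLES.items():
        print(f"{name:13s} R = {r_parity(*quantum_numbers):+d}")
```

Because R is multiplicative, an interaction vertex with a single superpartner would have total R = -1 and is forbidden when R-parity is conserved.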

Supersymmetry cannot be an exact symmetry of 
nature because there do not exist elementary fermions 
and bosons degenerate in mass. The superpotential 
[4] does not break supersymmetry spontaneously but 
even if it did the elementary fermions and bosons 
would on average have equal masses (they would 
satisfy some mass sum rule) which is also 


contradicted by the experimental data. Therefore, in 
the MSSM, supersymmetry has to be broken expli- 
citly but in such a way that the soft ultraviolet 
behavior remains intact. Remarkably, the super- 
symmetry breaking terms which can be added to the 
MSSM Lagrangian without reintroducing quadratic 
divergences make heavy just those fields which are 
opposite-statistics superpartners of the SM gauge
bosons and fermions. These so-called soft terms are: 


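$$\mathcal{L}_{soft} = -\tfrac12 M_3\,\tilde G\tilde G - \tfrac12 M_2\,\tilde W\tilde W - \tfrac12 M_1\,\tilde B\tilde B$$
$$\quad - m_Q^2|\tilde Q|^2 - m_U^2|\tilde U^c|^2 - m_D^2|\tilde D^c|^2 - m_L^2|\tilde L|^2 - m_E^2|\tilde E^c|^2$$
$$\quad - m_{H_u}^2|H_u|^2 - m_{H_d}^2|H_d|^2 - m_3^2\,(H_u H_d + \mathrm{c.c.})$$
$$\quad + A_U\tilde U^c\tilde Q H_u + A_D\tilde D^c\tilde Q H_d + A_E\tilde E^c\tilde L H_d \qquad [5]$$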


and yield gaugino (gluino $\tilde G$, wino $\tilde W$, and bino $\tilde B$)
and scalar mass terms as well as explicit trilinear 
couplings between scalars (scalar mass terms and 
A-terms are 3 x 3 matrices in the flavor space). As a 
result, supersymmetry is broken in the mass spectra 
but not in the dimensionless couplings. 

The origin of the soft supersymmetry breaking 
remains an open issue. Terms [5] are most probably 
remnants of the spontaneous supersymmetry break- 
ing in the so-called “hidden” sector — a hypothetical 
set of fields that do not interact directly with the 
MSSM fields. For example, in the popular scenario, 
they interact with the MSSM fields only gravitation- 
ally and spontaneous supersymmetry breaking in the 
hidden sector is communicated to the MSSM sector 
by gravitational interactions giving rise to terms [5]. 
Several other mechanisms of supersymmetry break- 
ing transmission have also been proposed (gauge 
mediation, anomaly mediation, etc.). 

The mass parameters and A-terms in [5] are free 
parameters of the low-energy supersymmetric theory 
and, combined with the interactions like $q\tilde q\tilde G$
originating from supersymmetric kinetic terms, may 
be a new, troublesome, source of flavor changing 
neutral currents and of CP violation. 


Higgs Sector of the MSSM 
The MSSM Higgs potential reads 


$$V = m_1^2|H_d|^2 + m_2^2|H_u|^2 + m_3^2\,(H_u H_d + \mathrm{c.c.}) + \frac{g_1^2 + g_2^2}{8}\left(|H_u|^2 - |H_d|^2\right)^2 \qquad [6]$$

Its quartic part is uniquely determined by the
structure of the supersymmetric gauge theory. The
parameters $m_1^2$, $m_2^2$, and $m_3^2$ are determined by
the soft supersymmetry breaking Higgs boson
masses [5] and the $\mu$ parameter in [4]. The potential
[6] is bounded from below for $m_1^2 + m_2^2 > 2|m_3^2|$, and
for $m_1^2 m_2^2 - m_3^4 < 0$ it has the electroweak symmetry
breaking minimum at $v_u = \langle H_u^0\rangle \neq 0$, $v_d = \langle H_d^0\rangle \neq 0$.
The ratio $v_u/v_d = \tan\beta$ is then phenomenologically
a very important parameter.
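
The two conditions on the quadratic part of the potential can be turned into a small check. The sketch assumes they read $m_1^2 + m_2^2 > 2|m_3^2|$ (bounded from below along the flat direction of the quartic term) and $m_1^2 m_2^2 - m_3^4 < 0$ (origin unstable); the function name and the sample numbers are hypothetical.

```python
def classify_higgs_potential(m1_sq, m2_sq, m3_sq):
    """Classify the tree-level potential from its mass parameters (GeV^2).

    Returns (bounded_from_below, breaks_electroweak_symmetry) under the
    assumed conditions stated in the lead-in.
    """
    bounded_from_below = m1_sq + m2_sq > 2.0 * abs(m3_sq)
    # A negative determinant of the mass matrix means the origin
    # is a saddle point, so a symmetry-breaking minimum develops.
    breaks_symmetry = m1_sq * m2_sq - m3_sq ** 2 < 0.0
    return bounded_from_below, breaks_symmetry


if __name__ == "__main__":
    samples = {
        "m2^2 driven negative": (250000.0, -5000.0, 20000.0),
        "symmetric vacuum": (10000.0, 10000.0, 5000.0),
        "unbounded direction": (1000.0, 1000.0, 5000.0),
    }
    for label, params in samples.items():
        print(label, classify_higgs_potential(*params))
```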

Quantum corrections to the mass parameters in 
[6] are controlled by the mass scale $M_{soft}$ of the
supersymmetry breaking terms [5]; at the one-loop 
level instead of [3], one finds 


M 

óm; ~ - ETC (383 T gi- 12y5, s M In Mi. [7] 

sort 
where $y_b$ and $y_t$ are the bottom- and top-quark
Yukawa couplings, respectively, and $\Lambda_{new}$ is the scale
at which the soft supersymmetry breaking terms are
generated by the putative supersymmetry breaking
transmission mechanism. In gravity mediation scenarios,
$\Lambda_{new} \sim M_{Pl}$. In gauge mediation scenarios, $\Lambda_{new}$
is low but it is a new scale, introduced by hand.

In the softly broken supersymmetric models, the
hierarchy problem is solved for $M_{soft} \lesssim O(10)\,v$.
Moreover, eqn [7] shows that via quantum corrections
the large top-quark Yukawa coupling $y_t$ drives
the mass parameter $m_2^2$ to a negative value, inducing
the electroweak symmetry breaking. This means that
in supersymmetric models the electroweak scale is
calculable in terms of the known coupling constants
and the (unknown) scales $M_{soft}$ and cutoff scale
$\Lambda_{new}$ to the MSSM. If $M_{soft} \lesssim O(10)\,v$, the correct
electroweak scale is obtained for $\Lambda_{new} \sim M_{GUT}$.
This nicely fits with unification of the gauge
couplings.

In supersymmetric models, the quartic couplings 
in the Higgs potential are restricted. This typically 
leads to a strong upper bound on the mass of the 
lightest Higgs particle. In the minimal model with 
the potential [6], at the tree level 


$$M_{Higgs} < M_Z = 91\ \mathrm{GeV} \qquad [8]$$


This bound is substantially modified by quantum 
corrections. They depend quadratically on the top- 
quark mass and logarithmically on the stop mass 
scale $M_{\tilde t} \sim M_{soft}$:


Me... Ap" [9] 
where A is given by 
= l(g$ + g1) cos” 28 + AA 
3g) mj : 
with AA = -2% In — [10] 
87? AM m? 
For $M_{\tilde t} < 1$ TeV, $M_{Higgs} < 130$ GeV.

The minimal-model bound on the Higgs mass can
be relaxed in models with an extended Higgs sector.
For instance, if an additional gauge group singlet
chiral superfield couples to the Higgs doublets, the
Higgs self-coupling $\lambda$ in [9] receives additional
contributions. Explicit calculations show that in
such and other models, with $M_{soft} \lesssim 1$ TeV, the
bound on the Higgs mass cannot be raised above
$\sim 150$ GeV if one wants to preserve perturbative
gauge coupling unification.


Supersymmetric Grand Unified Theories 


There are two striking aspects of the matter 
spectrum in the SM. One is the chiral anomalies 
cancelation (Weinberg 1996-2000, Pokorski 2000),
which is necessary for a unitary (and renormalizable)
theory, and occurs thanks to a certain conspiracy
between quarks and leptons, suggesting a deeper link
between them. The second one is that the spectrum
fits into simple representations of the SU(5) and 
SO(10) groups (Ross 1985). Indeed, each generation 
of the SM matter fills 5* + 10 + 1 (if the right-handed 
neutrino is included into the spectrum) representations 
of SU(5) and for SO(10), 16=5*+10+ 1. The 
assignment of fermions to the SU(5) or SO(10) 
representations fixes the normalization of the $U(1)_Y$
generator. Both facts suggest unification of strong and 
electroweak elementary forces in a grand unified 
theory with some bigger gauge symmetry group. Such 
unification implies that all the SM gauge forces 
become of equal strength at some unification scale. 
Their strength is measured by the running gauge
couplings $\alpha_i = g_i^2/4\pi$, $i = 1, 2, 3$, of the three group
factors $SU(3)_c \times SU(2)_L \times U(1)_Y$. The energy scale
dependence of $\alpha_i$ is governed by the renormalization
group equations. In the first nontrivial approximation,
they read:

$$\frac{1}{\alpha_i(Q)} = \frac{1}{\alpha_i(M_Z)} - \frac{b_i}{2\pi}\ln\frac{Q}{M_Z} \qquad [11]$$

Here, $1/\alpha_i(M_Z) = (58.98 \pm 0.04,\ 29.57 \pm 0.03,\ 8.40 \pm 0.14)$
are the experimental values of the
gauge couplings at the Fermi scale and $b_i$ are the
coefficients which depend on the matter content of
the theory. They are

$$b_i = \left(\tfrac{1}{10} + \tfrac43 N_g,\ -\tfrac{43}{6} + \tfrac43 N_g,\ -11 + \tfrac43 N_g\right)$$

in the SM and

$$b_i = \left(\tfrac35 + 2N_g,\ -5 + 2N_g,\ -9 + 2N_g\right)$$

in the MSSM, where $N_g$ is the number of fermion
generations. In the SM, the running gauge couplings
approach each other at a high scale of order $10^{13}$ GeV
but never unify. 

In the MSSM, with sparticle spectrum characterized
by $M_{soft} \approx 1$ TeV and for the initial Fermi scale
values given above, the three gauge couplings unify
with high precision at the scale $M_{GUT} \sim 10^{16}$ GeV.
Therefore, the MSSM can be embedded into supersymmetric
grand unified theories with no hierarchy
problem for the Fermi scale (it is stable with respect
to radiative corrections generated by particles with
masses $\sim M_{GUT}$) and no conflict with the measured
values of the gauge couplings.
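
The unification statement can be reproduced approximately with a few lines of one-loop running. The sketch below assumes the one-loop solution $1/\alpha_i(Q) = 1/\alpha_i(M_Z) - (b_i/2\pi)\ln(Q/M_Z)$ with the standard coefficients for three generations, takes $M_Z = 91.19$ GeV, and, as a simplification, runs with the MSSM coefficients all the way from $M_Z$ (ignoring the superpartner threshold near 1 TeV). The numbers are therefore indicative only.

```python
import math

M_Z_GEV = 91.19                          # assumed reference scale
INV_ALPHA_MZ = (58.98, 29.57, 8.40)      # 1/alpha_i at M_Z for U(1), SU(2), SU(3)


def beta_coefficients(model, n_gen=3):
    """One-loop coefficients b_i for the SM or the MSSM."""
    if model == "SM":
        return (0.1 + 4.0 * n_gen / 3.0,
                -43.0 / 6.0 + 4.0 * n_gen / 3.0,
                -11.0 + 4.0 * n_gen / 3.0)
    if model == "MSSM":
        return (0.6 + 2.0 * n_gen, -5.0 + 2.0 * n_gen, -9.0 + 2.0 * n_gen)
    raise ValueError("model must be 'SM' or 'MSSM'")


def inverse_alpha(q_gev, model):
    """Inverse couplings at scale Q from the one-loop solution."""
    t = math.log(q_gev / M_Z_GEV)
    return tuple(a - b * t / (2.0 * math.pi)
                 for a, b in zip(INV_ALPHA_MZ, beta_coefficients(model)))


def crossing_scale_12(model):
    """Scale at which alpha_1 and alpha_2 meet."""
    b1, b2, _ = beta_coefficients(model)
    t = 2.0 * math.pi * (INV_ALPHA_MZ[0] - INV_ALPHA_MZ[1]) / (b1 - b2)
    return M_Z_GEV * math.exp(t)


def mismatch_at_crossing(model):
    """Return the 1-2 crossing scale and |1/alpha_3 - 1/alpha_2| there."""
    q = crossing_scale_12(model)
    _, inv_a2, inv_a3 = inverse_alpha(q, model)
    return q, abs(inv_a3 - inv_a2)


if __name__ == "__main__":
    for model in ("SM", "MSSM"):
        q, gap = mismatch_at_crossing(model)
        print(f"{model}: alpha_1 = alpha_2 at Q ~ {q:.2e} GeV, "
              f"gap to 1/alpha_3 = {gap:.2f}")
```

In this approximation the third coupling misses the crossing point by several units of $1/\alpha$ in the SM, while in the MSSM the gap is a small fraction of a unit.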

In the SM, the baryon number is (perturbatively) 
conserved since there are no renormalizable couplings 
violating this symmetry. Experimental search for
proton decay, for example, $p \to e^+\pi^0$, $p \to K^+\bar\nu$, is
one of the most fundamental tests for particle physics.
The present limit on the proton lifetime is $\tau_p > 10^{33}$ yr.
In grand unified theories, baryon number
conservation is violated by interactions mediated by
the heavy gauge bosons corresponding to the enlarged
gauge symmetry (e.g., SU(5)), spontaneously broken at
$M_{GUT}$ to the SM gauge symmetry. Such interactions
manifest themselves at low energy as additional,
nonrenormalizable interactions added to the SM
Lagrangian. Proton decay is then induced by the set
of dimension-6 operators of the form


$$O^{(6)}_i = \frac{c^{(6)}_i}{M^2_{(6)}}\, qqql \qquad [12]$$

where q, l denote quarks and leptons, respectively.
For $c^{(6)}_i \sim \alpha_{GUT} \approx 1/25$, the experimental limit on
$\tau_p$ requires $M_{(6)} \gtrsim 10^{15}$ GeV, consistently with
$M_{GUT} \approx 10^{16}$ GeV in supersymmetric GUTs. However,
in supersymmetric GUTs, there is still another,
genuinely supersymmetric, source of contributions
to the proton decay amplitudes. These are the
dimension-5 operators

$$O^{(5)}_i = \frac{c^{(5)}_i}{M_{(5)}}\, qq\tilde q\tilde l \qquad [13]$$

where $\tilde q$, $\tilde l$ denote squarks and sleptons, respectively.
Such operators originate from the exchange of the
color triplet scalars present in the Higgs boson GUT
multiplets, with $M_{(5)} \sim M_{GUT} \sim 10^{16}$ GeV, and
$c^{(5)}_i \sim 10^{-7}$ is given by the Yukawa couplings.
Inserted into diagrams with gaugino exchanges they
give rise to dimension-6 operators of the form [12].
One then gets $c^{(6)}_i = \alpha_{GUT}\, c^{(5)}_i$, $M^2_{(6)} = M_{(5)} M_{SUSY}$.
Given various uncertainties, for example, in the 
unknown squark, gaugino, and heavy Higgs boson 
mass spectrum, such contributions in supersym- 
metric GUT models predict the proton life time to 


be consistent with but close to the present experi- 
mental limits. 


Summary 


Supersymmetry is distinct in several very important 
points from all other proposed solutions to the 
hierarchy problem. First of all, it provides a general 
theoretical framework which allows one to address 
many physical questions. Supersymmetric models, 
like the MSSM or its simple extensions, satisfy a 
very important criterion of "perturbative calculability."
In particular, they are easily consistent with
the precision electroweak data. The SM is their 
low-energy approximation in the sense of the 
Appelquist-Carazzone decoupling, so most of the 
successful structure of the SM is built into super- 
symmetric models. The quadratically divergent quan- 
tum corrections to the Higgs mass parameter (the 
origin of the hierarchy problem in the SM) are absent 
in any order of perturbation theory. Therefore, the 
cutoff to a supersymmetric theory can be as high as 
the Planck scale, and “small” scale of the electroweak 
breaking is still natural. Supersymmetry is not only 
consistent with grand unification of elementary forces 
but, in fact, makes it very successful. And, finally, 
supersymmetry is needed for string theory. 
However, there are also some problems to be solved.
The hierarchy problem of the electroweak scale is solved,
but the origin of the soft supersymmetry breaking scale
$M_{soft}$ remains an open question: spontaneous supersymmetry
breaking and its transmission to the visible
sector is a difficult problem, and a fully satisfactory
mechanism which would yield $M_{soft}$ hierarchically
smaller than the Planck (string) scale has not yet been
found. On the phenomenological side, there are new
potential sources of flavor-changing neutral current 
transitions and of CP violation, and baryon and lepton 
numbers are not automatically conserved by the 
renormalizable couplings. But even those problems 
can at least be discussed in a concrete quantitative way. 


See also: Brane Construction of Gauge Theories; 
Perturbation Theory and its Techniques; Seiberg-Witten 
Theory; Standard Model of Particle Physics; 
Supergravity; Supermanifolds. 

Introduction


Supersymmetric quantum mechanics is a specific 
extension of quantum mechanics with fermionic 
degrees of freedom. In quantum field theory and 
many-body theory, a fermionic degree of freedom is 
one which is subject to Pauli’s principle: any 
nondegenerate quantum state associated with a 
fermionic degree of freedom can be occupied at 
most once at any time. Similarly, in quantum 
mechanics, one associates a fermionic degree of 
freedom with an observable, the eigenvalue spec- 
trum of which is restricted to the discrete set (0, 1). 

The simplest example of a purely fermionic 
quantum system is the fermionic oscillator. It is 
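represented by conjugate operators $(f, f^\dagger)$ such that

$$f^2 = 0, \quad f^{\dagger 2} = 0, \quad f f^\dagger + f^\dagger f = 1 \qquad [1]$$

with a Hamiltonian $H_f$ given by the bilinear
expression

$$H_f = \epsilon_f + \hbar\omega f^\dagger f \qquad [2]$$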


The state space of this system is spanned by two
independent state vectors $|0\rangle$ and $|1\rangle$, such that

$$f|0\rangle = 0, \quad f^\dagger|0\rangle = |1\rangle, \quad f|1\rangle = |0\rangle, \quad f^\dagger|1\rangle = 0 \qquad [3]$$

By construction, the states $|n_f\rangle$ are eigenstates of
fermion number,

$$N_f = f^\dagger f, \quad N_f^2 = N_f \qquad [4]$$

with eigenvalue $n_f = (0, 1)$; this implements the Pauli
principle. The states have energy eigenvalues

$$E_{n_f} = \epsilon_f + n_f\hbar\omega, \quad n_f = (0, 1) \qquad [5]$$

differing in energy by $\Delta E = \hbar\omega$. Physically, the system
can be identified with a single fixed magnetic dipole in 


an external magnetic field, the only polarization states 
of the dipole being spin up or spin down. 
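
The algebra of the fermionic oscillator can be verified with explicit 2 x 2 matrices. The sketch below represents $f$ and $f^\dagger$ in the basis $(|0\rangle, |1\rangle)$ and checks $f^2 = 0$, $ff^\dagger + f^\dagger f = 1$ and $N_f^2 = N_f$; the helper names and the numerical values of $\epsilon_f$ and $\hbar\omega$ are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matadd(a, b):
    """Element-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


ZERO = [[0, 0], [0, 0]]
IDENTITY = [[1, 0], [0, 1]]

# Columns are labeled by the input state: f maps |1> to |0>.
F = [[0, 1], [0, 0]]
F_DAG = [[0, 0], [1, 0]]
N_F = matmul(F_DAG, F)  # fermion number operator, diag(0, 1)


def hamiltonian(eps_f, hbar_omega):
    """H_f = eps_f + hbar*omega * f^dagger f as a 2 x 2 matrix."""
    return [[eps_f * IDENTITY[i][j] + hbar_omega * N_F[i][j]
             for j in range(2)] for i in range(2)]


if __name__ == "__main__":
    print("f^2 =", matmul(F, F))
    print("{f, f^dagger} =", matadd(matmul(F, F_DAG), matmul(F_DAG, F)))
    print("H_f =", hamiltonian(0.5, 2.0))
```

The diagonal entries of `hamiltonian` are the two energies $\epsilon_f$ and $\epsilon_f + \hbar\omega$, matching the two polarization states of the dipole picture.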

In the Schrödinger representation of quantum
mechanics (wave mechanics), fermionic degrees of 
freedom are represented by anticommuting Grassmann 
variables. These have no immediate classical analog, 
but can be used to construct quasiclassical obser- 
vables like spin. 

A supersymmetric quantum system is a system 
possessing both fermionic and bosonic degrees of 
freedom, characterized by a degeneracy between 
states with even and odd fermion number. In the 
Schrödinger representation, this is manifest in a
symmetry transforming bosonic (Grassmann-even) 
into fermionic (Grassmann-odd) variables. The 
generators of the supersymmetry transformations 
square to the Hamiltonian of the system. 


The Supersymmetric Oscillator 


An elementary example of a supersymmetric quan- 
tum system is the supersymmetric oscillator. It is a 
physical system combining a standard bosonic 
quantum oscillator with a fermionic oscillator of 
the same frequency. The ordinary harmonic oscilla- 
tor is described by the pair of lowering and raising 
operators $(b, b^\dagger)$, with commutator

$$b b^\dagger - b^\dagger b = 1 \qquad [6]$$

and the Hamiltonian

$$H_b = \epsilon_b + \hbar\omega b^\dagger b \qquad [7]$$

In this case, the eigenvalue spectrum of the occupation
number

$$N_b = b^\dagger b \qquad [8]$$

consists of all non-negative integers $n_b = 0, 1, 2, \ldots$,
with corresponding energy eigenvalues. To construct
the supersymmetric oscillator, the harmonic oscillator
is combined with a fermionic oscillator [2] of the
same frequency:

$$H_s = \epsilon_0 + \hbar\omega\,(b^\dagger b + f^\dagger f) \qquad [9]$$
where $\epsilon_0 = \epsilon_b + \epsilon_f$. The ground state of this system is
the state annihilated by both b and f:

$$b|0,0\rangle = f|0,0\rangle = 0 \qquad [10]$$

The full set of energy eigenstates of the system is
constructed by taking

$$|n_b, n_f\rangle = \frac{1}{\sqrt{n_b!}}\,(b^\dagger)^{n_b}(f^\dagger)^{n_f}|0,0\rangle, \quad n_b = (0, 1, 2, \ldots), \quad n_f = (0, 1) \qquad [11]$$

with the energy eigenvalue spectrum

$$E(n_b, n_f) = \epsilon_0 + n\hbar\omega, \quad n = n_b + n_f \qquad [12]$$


Clearly, there is a degeneracy in energy between the
states $|n_b + 1, 0\rangle$ and $|n_b, 1\rangle$, which have the same
total occupation number n, but differ in the bosonic
and fermionic occupation number by one unit. This
is illustrated in Figure 1. Such pairs of states which
are degenerate in energy can be transformed into
each other by the operators

$$Q = \sqrt{2\hbar\omega}\; b^\dagger f, \quad Q^\dagger = \sqrt{2\hbar\omega}\; f^\dagger b \qquad [13]$$

The explicit transformations are

$$|n_b + 1, 0\rangle = \frac{1}{\sqrt{2(n_b+1)\hbar\omega}}\; Q|n_b, 1\rangle, \quad |n_b, 1\rangle = \frac{1}{\sqrt{2(n_b+1)\hbar\omega}}\; Q^\dagger|n_b + 1, 0\rangle \qquad [14]$$

The operations [14] are called supersymmetry
transformations, and the operators $Q$ and $Q^\dagger$ are
called supercharges.

As the zero point of energy is arbitrary in systems
without gravitational interactions, it is customary to
take $\epsilon_0 = 0$, that is, $\epsilon_f = -\epsilon_b$; with the normalization
[13], the Hamiltonian H is then the symmetrized
absolute square of the supercharges:

$$Q Q^\dagger + Q^\dagger Q = 2H \qquad [15]$$


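Figure 1 Spectrum of states of the supersymmetric oscillator.
Each level $E/\hbar\omega = n \geq 1$ contains the two states
$(n_b, n_f) = (n-1, 1)$ and $(n, 0)$, from (0,1), (1,0) at $n = 1$
up to (3,1), (4,0) at $n = 4$; the level $n = 0$ contains only (0,0).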


whilst

$$Q^2 = Q^{\dagger 2} = 0 \qquad [16]$$

The above relations suffice to guarantee that the
supercharges $(Q, Q^\dagger)$ are conserved:

$$[Q, H] = [Q^\dagger, H] = 0 \qquad [17]$$

a result re-expressing the degeneracy between states
with the same n but different $n_b$ and $n_f$. The real
form of the supercharges is

$$Q_1 = \frac{1}{2}\,(Q + Q^\dagger), \quad Q_2 = \frac{1}{2i}\,(Q - Q^\dagger) \qquad [18]$$

In this representation

$$H = Q_1^2 + Q_2^2 \qquad [19]$$

An important observation is that the ground state is the
only state annihilated by both supersymmetry operators:

$$Q|0,0\rangle = 0, \quad Q^\dagger|0,0\rangle = 0 \qquad [20]$$


Indeed, it is the only state with zero energy 
eigenvalue, and only such a state can be an 
invariant supersinglet; all other states have positive 
energy and they necessarily occur in supersymmetry 
pairs. 
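
The pairing of states and the relation between the supercharges and the Hamiltonian can be checked directly on the occupation-number basis. The following sketch stores a state as a dictionary from $(n_b, n_f)$ to an amplitude and implements $Q = \sqrt{2\hbar\omega}\,b^\dagger f$ and $Q^\dagger = \sqrt{2\hbar\omega}\,f^\dagger b$ using $b^\dagger|n_b\rangle = \sqrt{n_b+1}\,|n_b+1\rangle$ and $b|n_b\rangle = \sqrt{n_b}\,|n_b-1\rangle$. Setting $\hbar\omega = 1$ and $\epsilon_0 = 0$ is an assumption of the example, and the function names are hypothetical.

```python
import math

HBAR_OMEGA = 1.0  # assumption: energies in units of hbar*omega, eps_0 = 0


def apply_Q(vec):
    """Q = sqrt(2 hbar omega) b^dagger f on a {(n_b, n_f): amplitude} map."""
    out = {}
    for (nb, nf), amp in vec.items():
        if nf == 1:  # f removes the fermion, b^dagger adds one boson
            key = (nb + 1, 0)
            factor = math.sqrt(2.0 * HBAR_OMEGA * (nb + 1))
            out[key] = out.get(key, 0.0) + amp * factor
    return out


def apply_Q_dagger(vec):
    """Q^dagger = sqrt(2 hbar omega) f^dagger b on the same kind of map."""
    out = {}
    for (nb, nf), amp in vec.items():
        if nf == 0 and nb > 0:  # b removes one boson, f^dagger adds the fermion
            key = (nb - 1, 1)
            factor = math.sqrt(2.0 * HBAR_OMEGA * nb)
            out[key] = out.get(key, 0.0) + amp * factor
    return out


def energy(nb, nf):
    """Energy eigenvalue E = n hbar omega with n = n_b + n_f."""
    return HBAR_OMEGA * (nb + nf)


def anticommutator(state):
    """Apply Q Q^dagger + Q^dagger Q to a single basis state."""
    start = {state: 1.0}
    first = apply_Q(apply_Q_dagger(start))
    second = apply_Q_dagger(apply_Q(start))
    return {key: first.get(key, 0.0) + second.get(key, 0.0)
            for key in set(first) | set(second)}


if __name__ == "__main__":
    for nb in range(3):
        for nf in (0, 1):
            print((nb, nf), anticommutator((nb, nf)), "2E =", 2 * energy(nb, nf))
```

Only the state (0, 0) is sent to zero by both operators, which is the supersinglet property of the ground state; every other basis state is mapped onto its degenerate partner.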


Anticommuting Variables 


Fermionic degrees of freedom can be described in a
pseudoclassical formulation by anticommuting variables
$\xi$ taking values in an infinite-dimensional
Grassmann algebra:

$$\xi^2 = -\xi^2 = 0 \qquad [21]$$

With an anticommuting variable $\xi$, we can associate
a derivative operator $\partial/\partial\xi$, which is an element of
another Grassmann algebra such that

$$\frac{\partial}{\partial\xi}\,\xi + \xi\,\frac{\partial}{\partial\xi} = 1, \quad \left(\frac{\partial}{\partial\xi}\right)^2 = 0 \qquad [22]$$

This extends the original Grassmann algebra to a
Clifford algebra. Integration with respect to an
anticommuting variable is defined in the same way:

$$\int d\xi\; 1 = 0, \quad \int d\xi\; \xi = 1 \qquad [23]$$


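that is, integration is the same as differentiation for
anticommuting variables. With these definitions,
we can represent the fermionic raising and lowering
operators in terms of anticommuting variables as

$$f^\dagger \to \xi, \quad f \to \frac{\partial}{\partial\xi} \qquad [24]$$

and the states by

$$|0\rangle \to 1, \quad |1\rangle \to \xi \qquad [25]$$

Then an arbitrary state takes the form of a linear
superposition

$$|\psi\rangle = \psi_0|0\rangle + \psi_1|1\rangle \ \to\ \psi(\xi) = \psi_0 + \psi_1\xi \qquad [26]$$

and the standard positive-semidefinite inner product
on the state space is represented on the wave
functions by the double integral

$$\langle\phi|\psi\rangle = \int d\bar\xi\, d\xi\; e^{\bar\xi\xi}\,\phi^*(\bar\xi)\,\psi(\xi) = \phi_0^*\psi_0 + \phi_1^*\psi_1 \qquad [27]$$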


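By construction, $f^\dagger = \xi$ and $f = \partial/\partial\xi$ are conjugates
with respect to this inner product:

$$\langle\phi|f^\dagger\psi\rangle = \int d\bar\xi\, d\xi\; e^{\bar\xi\xi}\,\phi^*(\bar\xi)\,\xi\,\psi(\xi) = \int d\bar\xi\, d\xi\; e^{\bar\xi\xi}\left(\frac{\partial\phi}{\partial\xi}\right)^{*}(\bar\xi)\,\psi(\xi) = \langle f\phi|\psi\rangle \qquad [28]$$

The real (self-conjugate) forms of the fermion
operators are, therefore, defined by

$$\sigma_1 = \xi + \frac{\partial}{\partial\xi}, \quad \sigma_2 = i\left(\xi - \frac{\partial}{\partial\xi}\right) \qquad [29]$$

which satisfy the Pauli-Dirac anticommutation
relations

$$\sigma_i\sigma_j + \sigma_j\sigma_i = 2\delta_{ij} \qquad [30]$$

By taking the product, we obtain

$$\sigma_3 = -i\sigma_1\sigma_2 = 1 - 2\xi\frac{\partial}{\partial\xi} = 1 - 2N_f \qquad [31]$$

Thus, we may think of the wave functions as two-component
spinors, the components being labeled
either by the eigenvalues of the spin operator $\sigma_3$, or
equivalently by the fermion number $N_f$, which is a
projection operator on the states with negative spin.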
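
The spinor interpretation can be made concrete by letting the operators act on wave functions $\psi(\xi) = \psi_0 + \psi_1\xi$ stored as pairs $(\psi_0, \psi_1)$. The sketch below assumes the representation $f^\dagger = \xi$, $f = \partial/\partial\xi$ together with $\sigma_1 = \xi + \partial/\partial\xi$ and $\sigma_2 = i(\xi - \partial/\partial\xi)$; the sign convention for $\sigma_2$ is the one for which $-i\sigma_1\sigma_2 = 1 - 2N_f$, and the function names are hypothetical.

```python
def xi_times(psi):
    """Multiply by xi: xi * (psi0 + psi1 xi) = psi0 xi, since xi^2 = 0."""
    psi0, _psi1 = psi
    return (0, psi0)


def d_dxi(psi):
    """Differentiate with respect to xi: d/dxi (psi0 + psi1 xi) = psi1."""
    _psi0, psi1 = psi
    return (psi1, 0)


def combine(c1, a, c2, b):
    """Linear combination c1*a + c2*b of two wave functions."""
    return tuple(c1 * x + c2 * y for x, y in zip(a, b))


def sigma1(psi):
    return combine(1, xi_times(psi), 1, d_dxi(psi))


def sigma2(psi):
    return combine(1j, xi_times(psi), -1j, d_dxi(psi))


def sigma3(psi):
    """sigma_3 = -i sigma_1 sigma_2, acting with sigma_2 first."""
    return tuple(-1j * c for c in sigma1(sigma2(psi)))


def number(psi):
    """Fermion number N_f = xi d/dxi."""
    return xi_times(d_dxi(psi))


def close(a, b, tol=1e-12):
    """Component-wise comparison of two wave functions."""
    return all(abs(x - y) < tol for x, y in zip(a, b))


if __name__ == "__main__":
    psi = (2, 3)
    print("sigma_3 psi =", sigma3(psi))
    print("(1 - 2 N_f) psi =", combine(1, psi, -2, number(psi)))
```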

The action of the Hamiltonian on a wave function 
V(£) is represented by the integral 


HUE = | Ag AIHE DE) 182 
where H (£, €) is the ordered symbol of the Hamiltonian: 


H(€, €) = ef + hwEE [33] 


This expression is to be considered as the classical 
Hamiltonian of the system. In particular, the 
exponent of the action 


= hi dt (ihEE — H(£, £)) 


1 


=h fu de(icé+ €) + elta t) 34 
provides the integrand for the path-integral representation
of the evolution operator in the quantum
theory. The proof is not given here; the reader is
referred to the literature. In passing, note that as the
anticommuting variables $(\xi, \bar\xi)$ are taken to be
dimensionless, one actually should identify the
momentum conjugate to $\xi$ with $\pi = -i\hbar\bar\xi$; in the
quantum theory, this is replaced by the operator
$-i\hbar\,\partial/\partial\xi$.


Classical Supersymmetry 


The classical action for the supersymmetric oscillator
with bosonic amplitude x and fermionic amplitude
$\xi$ is

$$S = \int dt\left(\frac12\dot x^2 - \frac12\omega^2x^2 + i\bar\xi\dot\xi - \omega\bar\xi\xi\right) \qquad [35]$$

As inferred from the quantum theory, it is a
combination of a linear harmonic oscillator and a
fermionic oscillator of the same frequency. A factor
$\sqrt{\hbar}$ is also absorbed in $\xi$ and $\bar\xi$; equivalently, we can
use natural units in which $\hbar = 1$. In the following,
we use this convention.

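The action [35] is invariant under infinitesimal
symmetry transformations

$$\delta x = -i(\epsilon\bar\xi + \bar\epsilon\xi), \quad \delta\bar\xi = (\dot x + i\omega x)\,\bar\epsilon, \quad \delta\xi = (\dot x - i\omega x)\,\epsilon \qquad [36]$$

with $(\epsilon, \bar\epsilon)$ Grassmann-odd parameters. The Noether
theorem then implies that there are conserved
fermionic charges

$$Q = (p - i\omega x)\,\bar\xi, \quad \bar Q = (p + i\omega x)\,\xi \qquad [37]$$

with the momentum defined by $p = \dot x$. The other
conserved quantity is the energy, represented by the
Hamiltonian

$$H = \frac12\,(p^2 + \omega^2x^2) + \omega\bar\xi\xi \qquad [38]$$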


The canonical phase-space formulation is obtained
by defining brackets of two functions (A, B) on the
phase space $(x, p; \xi, \bar\xi)$ by

$$\{A, B\} = \frac{\partial A}{\partial x}\frac{\partial B}{\partial p} - \frac{\partial A}{\partial p}\frac{\partial B}{\partial x} + i(-1)^A\left(\frac{\partial A}{\partial\xi}\frac{\partial B}{\partial\bar\xi} + \frac{\partial A}{\partial\bar\xi}\frac{\partial B}{\partial\xi}\right) \qquad [39]$$

where $(-1)^A$ is the Grassmann parity of A. In terms
of these brackets, the time evolution and supersymmetry
transformations take the form

$$\dot A = -\{H, A\}, \quad \delta A = i\{\epsilon Q + \bar\epsilon\bar Q, A\} \qquad [40]$$

Moreover, the charges $Q$ and $\bar Q$ satisfy the bracket
algebra

$$\{Q, \bar Q\} = -2iH, \quad \{Q, H\} = \{\bar Q, H\} = 0 \qquad [41]$$

Thus, the action [35] is the classical counterpart of
the quantum theory [9]-[17] in the correspondence
limit $i\{A, B\} \to [A, B]_\pm = AB \pm BA$. For these theories,
supersymmetry is rooted in the classical
transformations [36].

The construction for the supersymmetric oscillator
can be generalized to other dynamical systems in
two ways. First, the nature of the interactions as
represented by the potential can be modified.
Second, the number of degrees of freedom can be
varied. This section presents a generalization of the
supersymmetric oscillator to anharmonic interactions,
obtained by modification of the supercharges
[37] with a general function $\Phi(x)$ as follows:

$$Q = (p - i\Phi(x))\,\bar\xi, \quad \bar Q = (p + i\Phi(x))\,\xi \qquad [42]$$


The brackets [39] imply the supersymmetry algebra
[41] with the Hamiltonian

$$H = \frac{i}{2}\{Q, \bar Q\} = \frac12 p^2 + \frac12\Phi^2(x) + \frac12\Phi'(x)\,(\bar\xi\xi - \xi\bar\xi) \qquad [43]$$

In quantum mechanics, the supercharges become
operators $Q$ and $Q^\dagger$ upon reinterpretation of (x, p)
as canonically conjugate operators, and the replacement
$\bar\xi \to f^\dagger$ and $\xi \to f$; this procedure involves no
ordering ambiguity. The Hamiltonian operator
defined by the anticommutator of $Q$ and $Q^\dagger$ then
takes the operator form associated with [43]. With
the identification

$$A = \frac{1}{\sqrt2}\,(p - i\Phi), \quad A^\dagger = \frac{1}{\sqrt2}\,(p + i\Phi) \qquad [44]$$

and making use of the (anti)commutation relations

$$AA^\dagger - A^\dagger A = \Phi'(x), \quad ff^\dagger + f^\dagger f = 1 \qquad [45]$$

this Hamilton operator can be written in normal-ordered
form as

$$H = \frac12\,(QQ^\dagger + Q^\dagger Q) = A^\dagger A + \Phi'(x)\,f^\dagger f \qquad [46]$$

It is positive-semidefinite by construction. All results
for the supersymmetric oscillator are reproduced
upon taking $\Phi(x) = \omega x$.

As the Hamiltonian commutes with the fermion
number operator $N_f$, we can label all stationary
states $|E, n_f\rangle$ by the energy E and the fermion
number $n_f = (0, 1)$. Moreover, all states of positive
energy are degenerate with respect to fermion
number, as they form pairs related by
supersymmetry:

$$Q|E, 0\rangle = \sqrt{2E}\,|E, 1\rangle, \quad Q^\dagger|E, 1\rangle = \sqrt{2E}\,|E, 0\rangle \qquad [47]$$


Only ground states with $E_0 = 0$ can occur as singlets
under supersymmetry. The existence of such a
ground state with fermion number $n_f$ amounts to
the existence of a state $|0, n_f\rangle$ satisfying

$$A^\dagger f|0, n_f\rangle = A f^\dagger|0, n_f\rangle = 0 \qquad [48]$$

The corresponding wave functions are of the form

$$|0, 0\rangle \to \Psi_0(x, \xi) = \psi_-(x), \quad |0, 1\rangle \to \Psi_1(x, \xi) = \psi_+(x)\,\xi \qquad [49]$$

where $\psi_\pm(x)$ are solutions of the equations

$$A\psi_- = 0, \quad A^\dagger\psi_+ = 0 \qquad [50]$$

These functions are formally given by the
expressions

$$\psi_\pm(x) = C_\pm\, e^{\pm\int^x \Phi(y)\,dy} \qquad [51]$$

For a zero-energy ground state to exist, one of these
functions must be normalizable. For example, if
$\Phi(x)$ is a polynomial of positive odd degree $2k - 1$,
then, depending on the sign of the coefficient of
$x^{2k-1}$, one of the exponents is bounded, approaching
zero for $x \to \pm\infty$, and as a result becomes square
integrable.

If no normalizable wave functions of the form 
[51] exist, the ground state cannot have zero energy 
(Eo > 0) and all states necessarily belong to 
superdoublets. 
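
The normalizability criterion for polynomial $\Phi(x)$ can be sketched as follows. The code assumes the formal solutions $\psi_\pm(x) = C_\pm\exp(\pm\int^x\Phi(y)\,dy)$ in units with $\hbar = 1$, decides which (if any) of them decays at both ends from the leading term of $\Phi$, and estimates the norm numerically with a midpoint rule. The function names, the integration window and the step count are illustrative choices.

```python
import math


def integrated_phi(coeffs, x):
    """Integral of Phi from 0 to x, where coeffs[k] multiplies x**k in Phi."""
    return sum(c * x ** (k + 1) / (k + 1) for k, c in enumerate(coeffs))


def zero_mode_type(coeffs):
    """Return 'psi_minus', 'psi_plus' or None for a polynomial Phi.

    An even-degree Phi integrates to an odd-degree polynomial, so
    neither exponential decays at both ends and no zero mode exists.
    """
    trimmed = list(coeffs)
    while trimmed and trimmed[-1] == 0:
        trimmed.pop()
    if not trimmed:
        return None
    degree = len(trimmed) - 1
    if degree % 2 == 0:
        return None
    return "psi_minus" if trimmed[-1] > 0 else "psi_plus"


def norm_estimate(coeffs, sign, half_width=6.0, steps=6000):
    """Midpoint estimate of the integral of exp(2*sign*W) over [-L, L]."""
    step = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * step
        total += math.exp(2.0 * sign * integrated_phi(coeffs, x)) * step
    return total


if __name__ == "__main__":
    print(zero_mode_type([0.0, 1.0]))          # Phi = x (oscillator, omega = 1)
    print(zero_mode_type([0.0, -1.0, 0.0, 1.0]))  # Phi = x^3 - x
    print(zero_mode_type([1.0, 0.0, 1.0]))     # Phi = x^2 + 1
    print(norm_estimate([0.0, 1.0], -1))
```

For the oscillator case the normalizable solution is $\psi_-$, a Gaussian, so the zero-energy state has fermion number 0; an even-degree $\Phi$ gives no normalizable solution and all states then belong to superdoublets.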


Spinning-Particle Mechanics 


Minimal supersymmetric classical or quantum
mechanics requires an equal number of bosonic and
fermionic coordinates in configuration space $(x_i, \xi_i)$,
rather than an equal number of bosonic and fermionic
degrees of freedom in phase space. Specifically,
minimal free supersymmetric particle mechanics in
n dimensions is described by the classical
Lagrangian

$$L = \frac12\dot x_i^2 + \frac{i}{2}\,\xi_i\dot\xi_i, \quad i = 1, \ldots, n \qquad [52]$$

It is invariant modulo a total time derivative under
infinitesimal supersymmetry transformations

$$\delta x_i = -i\epsilon\xi_i, \quad \delta\xi_i = \dot x_i\,\epsilon \qquad [53]$$
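
The canonical phase-space formulation is phrased
in terms of the free-particle momentum and
Hamiltonian

$$p_i = \dot x_i, \quad H = \frac12 p_i^2 \qquad [54]$$

and the brackets

$$\{A, B\} = \frac{\partial A}{\partial x_i}\frac{\partial B}{\partial p_i} - \frac{\partial A}{\partial p_i}\frac{\partial B}{\partial x_i} + i(-1)^A\,\frac{\partial A}{\partial\xi_i}\frac{\partial B}{\partial\xi_i} \qquad [55]$$

The supersymmetry transformations are generated
by the supercharge

$$Q = p_i\xi_i, \quad \delta A = i\{\epsilon Q, A\} \qquad [56]$$

with the supersymmetry algebra

$$i\{Q, Q\} = 2H, \quad \{Q, H\} = 0 \qquad [57]$$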


An important quantity in these models is the bilinear
(Grassmann-even) antisymmetric tensor

$$\sigma_{ij} = -i\xi_i\xi_j \qquad [58]$$

For a free particle, it is a set of constants of motion
forming a representation of so(n), the Lie algebra of
n-dimensional rotations:

$$\{\sigma_{ij}, \sigma_{kl}\} = \delta_{jk}\sigma_{il} - \delta_{jl}\sigma_{ik} - \delta_{ik}\sigma_{jl} + \delta_{il}\sigma_{jk} \qquad [59]$$

Therefore, the physical interpretation of $\sigma_{ij}$ is that it
represents the particle spin. For this reason, supersymmetric
particle mechanics is often called
spinning-particle mechanics.

Quantum mechanics of the spinning particle has
the same algebraic structure, with $(x_i, p_i)$ the
standard canonically conjugate operators, and the
fermionic coordinates $\xi_i$ represented by the generators
of a Clifford algebra; the irreducible representation
in terms of Pauli-Dirac matrices of dimension
$2^{[n/2]} \times 2^{[n/2]}$ is

$$\xi_i \to \frac{1}{\sqrt2}\,\gamma_i, \quad \gamma_i\gamma_j + \gamma_j\gamma_i = 2\delta_{ij} \qquad [60]$$

It follows that the wave functions have $2^{[n/2]}$
components, describing different polarization states.
Furthermore, in minimal supersymmetric quantum
mechanics, the supersymmetry operator is represented
by the Dirac operator:

$$Q \to \frac{1}{\sqrt2}\,\gamma\cdot p, \quad (\gamma\cdot p)^2 = p^2 = 2H \qquad [61]$$

Hence, the stationary states of the system solve the
Dirac equation

$$\gamma\cdot p\,\Psi = \sqrt{2E}\,\Psi \qquad [62]$$


The models can, without difficulty, be extended to
include interactions with external fields. As an
example, we consider the coupling to a magnetic
field described by a vector potential $A_i(x)$. An
extension of the free-particle action [52], invariant
under the same supersymmetry transformations
[53], is

$$S = \int dt\left(\frac12\dot x_i^2 + \frac{i}{2}\,\xi_i\dot\xi_i + qA_i(x)\,\dot x_i - \frac{iq}{2}\,F_{ij}\xi_i\xi_j\right) \qquad [63]$$

where $F_{ij} = \nabla_iA_j - \nabla_jA_i$ is the field strength. The
canonical momentum in this model is

$$p_i = \dot x_i + qA_i(x) \qquad [64]$$

with the result that the canonical expressions for the
Hamiltonian and supercharge become

$$H = \frac12\,(p_i - qA_i(x))^2, \quad Q = (p_i - qA_i(x))\,\xi_i \qquad [65]$$

In the quantum theory, these constants of motion
become the covariant Laplacian and Dirac operator
in an external vector potential $A_i(x)$. Observe that
supersymmetry requires the spin to couple to the
magnetic field with gyromagnetic ratio g = 2. Explicitly,
the equation of motion for $\xi$ can be transformed
into an equation for the spin precession:

$$\dot\xi_i = qF_{ij}\xi_j \ \Rightarrow\ \dot\sigma_{ij} = q\,(F_{ik}\sigma_{kj} - \sigma_{ik}F_{kj}) \qquad [66]$$

In three dimensions, this is equivalent to an equation
in terms of axial vectors:

$$F_{ij} = \epsilon_{ijk}B_k, \quad \sigma_{ij} = \epsilon_{ijk}s_k \ \Rightarrow\ \dot{\mathbf s} = -q\,\mathbf B\times\mathbf s \qquad [67]$$

showing that the precession rate of s is given by
twice the Larmor frequency.


Extended Supersymmetry 


It is possible to construct theories with more
supersymmetries by associating with every bosonic
coordinate several fermionic coordinates. An example
is the supersymmetric oscillator and its generalizations
considered earlier, which have an equal number
of bosonic and fermionic degrees of freedom in
phase space, rather than an equal number of bosonic
and fermionic coordinates in configuration space.
The classical phase space, spanned by variables
$(x_i, p_i; \xi_i, \bar\xi_i)$ with $i = 1, \ldots, n$, then has double the
number of fermionic variables compared to the
minimal supersymmetric particle models. Such models
can be constructed for systems with an
n-dimensional bosonic configuration space. Their
supercharges take the form

$$Q = (p_i - i\Phi_i(x))\,\bar\xi_i, \quad \bar Q = (p_i + i\Phi_i(x))\,\xi_i \qquad [68]$$
whilst the Hamiltonian becomes

$$H = \frac12 p_i^2 + \frac12\Phi_i^2(x) + \frac14\,(\nabla_i\Phi_j + \nabla_j\Phi_i)(\bar\xi_i\xi_j - \xi_j\bar\xi_i) \qquad [69]$$

The supercharges are conserved if the curl of $\Phi_i(x)$
vanishes: $\nabla_i\Phi_j - \nabla_j\Phi_i = 0$. It follows that at least
locally there exists a single function W(x) such that

$$\Phi_i(x) = \nabla_i W(x) \qquad [70]$$


W(x) is called the superpotential. Defining the
operators

$$A_i = p_i - i\Phi_i(x), \quad A_i^\dagger = p_i + i\Phi_i(x), \quad A_iA_j^\dagger - A_j^\dagger A_i = \nabla_i\Phi_j + \nabla_j\Phi_i \qquad [71]$$

the supersymmetric quantum theory is defined by

$$Q = A_if_i^\dagger, \quad Q^\dagger = A_i^\dagger f_i, \quad H = \frac12\,(QQ^\dagger + Q^\dagger Q) \qquad [72]$$

The Hamiltonian is the direct operator translation of
the classical expression [69]; its normal-ordered form is

$$H = \frac12 A_i^\dagger A_i + \frac12\,(\nabla_i\Phi_j + \nabla_j\Phi_i)\,f_i^\dagger f_j \qquad [73]$$

The total fermion number operator

$$N_f = f_i^\dagger f_i \qquad [74]$$

(summed over i), satisfying the commutation
relations

$$[N_f, f_i^\dagger] = f_i^\dagger, \quad [N_f, f_i] = -f_i \qquad [75]$$

commutes with the Hamiltonian. Hence, the stationary
states can be labeled by the energy E and the total
fermion number $n_f = (0, \ldots, n)$. The energy spectrum
being positive semidefinite, all positive-energy states
occur in pairs of fermion number $(n_f, n_f + 1)$;
zero-energy states exist only if the equations

$$A_if_i^\dagger|0, n_f\rangle = A_i^\dagger f_i|0, n_f\rangle = 0 \qquad [76]$$


admit a normalizable solution. In this context, the
vanishing of the curl of $\Phi_i(x)$ is important, as it is a
necessary condition for the formal solutions

$$\psi_\pm(x) = C_\pm\exp\left(\pm\int^x\Phi_i(y)\,dy_i\right) = C_\pm\, e^{\pm W(x)} \qquad [77]$$

to be single-valued. If one of them is normalizable,
there exists a zero-energy ground state with $n_f = 0$
or $n_f = n$, represented by a wave function:

$$|0, 0\rangle \to \Psi_0(x, \xi) = \psi_-(x), \quad |0, n\rangle \to \Psi_n(x, \xi) = \psi_+(x)\,\xi_1\cdots\xi_n \qquad [78]$$


Alternatively, we can represent the wave functions
as spinors of dimension $2^n$, on which the fermion
operators $f_i$ and $f_i^\dagger$ act as a $2^n$-dimensional matrix
representation of the Clifford algebra with generators
$\gamma_a$, $a = 1, \ldots, 2n$, defined by

$\gamma_i = f_i + f_i^\dagger, \qquad \gamma_{i+n} = i(f_i - f_i^\dagger)$ [79]

These operators indeed satisfy the anticommutation
rule

$\gamma_a\gamma_b + \gamma_b\gamma_a = 2\delta_{ab}$ [80]

Thus, the wave functions have $2^n$ components, as
compared to the $2^{[n/2]}$ polarization states of the
minimal models.
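
The anticommutation rule [80] can be checked numerically. The following Python sketch is an illustration, not part of the original text: it assumes a Jordan-Wigner matrix realization of the fermion operators (one conventional choice among many), and the helper names `fermion_ops`, `clifford_generators`, and `max_clifford_defect` are hypothetical. It builds the $2n$ generators of [79] as $2^n \times 2^n$ matrices and measures how far they are from satisfying [80].

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def fermion_ops(n):
    """Assumed Jordan-Wigner realization: f_k = Z x ... x Z x a x 1 x ... x 1."""
    one = [[1, 0], [0, 1]]
    z = [[1, 0], [0, -1]]
    lower = [[0, 1], [0, 0]]  # single-mode annihilation operator
    ops = []
    for k in range(n):
        m = [[1]]
        for site in range(n):
            m = kron(m, z if site < k else lower if site == k else one)
        ops.append(m)
    return ops


def clifford_generators(n):
    """gamma_i = f_i + f_i^dagger and gamma_{i+n} = i (f_i - f_i^dagger), as in [79]."""
    first, second = [], []
    for f in fermion_ops(n):
        fd = dagger(f)
        dim = len(f)
        first.append([[f[i][j] + fd[i][j] for j in range(dim)] for i in range(dim)])
        second.append([[1j * (f[i][j] - fd[i][j]) for j in range(dim)]
                       for i in range(dim)])
    return first + second


def max_clifford_defect(n):
    """Largest entry of |gamma_a gamma_b + gamma_b gamma_a - 2 delta_ab|."""
    g = clifford_generators(n)
    dim = 2 ** n
    worst = 0.0
    for a in range(2 * n):
        for b in range(2 * n):
            ab, ba = matmul(g[a], g[b]), matmul(g[b], g[a])
            for i in range(dim):
                for j in range(dim):
                    target = 2.0 if (a == b and i == j) else 0.0
                    worst = max(worst, abs(ab[i][j] + ba[i][j] - target))
    return worst
```

For small $n$ the defect returned by `max_clifford_defect(n)` should vanish, and the matrix size $2^n$ is the number of wave-function components quoted above.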


The Witten Index 


We have noted that for supersymmetric quantum
systems, like the harmonic and anharmonic supersymmetric
oscillator, states exist in pairs of different
fermion number, degenerate in energy, except for
possibly one or more zero-energy states which are
superinvariant in the sense that

$Q|0, n\rangle = \bar{Q}|0, n\rangle = 0 \;\Leftrightarrow\; H|0, n\rangle = 0$ [81]

In the Schrödinger representation, these states are
characterized as zero modes of the Dirac operator:

$\gamma \cdot D\,\Psi = 0$ [82]


where $D_i$ is an ordinary or field-dependent (e.g.,
covariant) derivative. Clearly, the existence of such
states can, in some cases, be guaranteed if there is no
state which can pair up with a given state to form a
superdoublet. Witten developed a topological characterization
of this condition, encoded in an index
defined by

$I = \mathrm{tr}\,(-1)^{N_f} = n_b(E = 0) - n_f(E = 0)$ [83]


where $N_f$ is the fermion number operator, and
$n_{b,f}(E = 0)$ are the numbers of bosonic and fermionic
zero-energy states. The trace is taken over the
complete space of states, but as all nonzero energy
states occur in pairs of a bosonic and a fermionic
state, their contributions to the trace cancel, having
opposite sign. Therefore, the trace is actually only
over the zero-energy states, and counts the number
of bosonic states with positive sign, and the number
of fermionic states with negative sign. If the index
vanishes, $I = 0$, then any zero-energy states necessarily
exist in equal number of bosonic and fermionic
states; under perturbations of the potential, these
states can form pairs and change their energy to a
positive value. However, if the index does not
vanish, $I \ne 0$, then there are states which have no
partner of complementary fermion number; these
states can never get a nonzero energy under changes
in the parameters of the potential, as long as the
changes respect supersymmetry. Such systems, therefore,
necessarily possess exact zero-energy states
which are invariant under all supersymmetries.

Deformations of the potential respecting super- 
symmetry are those obtained by changing the 
parameters in the superpotential. The usefulness of 
this concept is, therefore, that the index for models 
with complicated superpotentials can be computed 
by comparing them with models with simple super- 
potentials having similar topological properties. 

Counting the number of states is not always a
simple procedure, in particular when the spectrum
includes continuum states. Therefore, in practice one
often needs a regularization procedure, by taking the
trace over the full state space of the exponentially
damped quantity

$I(\beta) = \mathrm{tr}\,(-1)^{N_f} e^{-\beta H}$ [84]

and taking the limit $\beta \to 0$. The quantity [84] can be
computed in terms of a path integral with periodic
boundary conditions for the fermionic degrees of
freedom.
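
As a small illustration of the regularized trace [84], the Python sketch below evaluates $I(\beta)$ for the supersymmetric harmonic oscillator. The spectrum used here is an assumption stated for the example (bosonic levels $E = k\omega$ with $n_f = 0$ and fermionic levels $E = (k+1)\omega$ with $n_f = 1$, $k = 0, 1, 2, \ldots$), and the function name `regularized_index` and the truncation parameter `levels` are hypothetical.

```python
import math


def regularized_index(beta, omega=1.0, levels=200):
    """Truncated tr (-1)^{N_f} exp(-beta H) for the assumed oscillator spectrum.

    Bosonic states (n_f = 0) enter with sign +1, fermionic states (n_f = 1)
    with sign -1.  Every positive-energy level is paired, so only the single
    zero-energy bosonic state survives the cancellation.
    """
    total = 0.0
    for k in range(levels):
        total += math.exp(-beta * omega * k)        # n_f = 0, E = k * omega
        total -= math.exp(-beta * omega * (k + 1))  # n_f = 1, E = (k + 1) * omega
    return total
```

The truncated sum telescopes to $1 - e^{-\beta\omega\,\mathtt{levels}}$, so the result is close to 1 whenever $\beta\omega\,\mathtt{levels}$ is large. The residual term comes from the one unpaired state at the cutoff, which is a reminder that a regularization must not split superdoublets.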


Finally, as the wave function representation of 
supersymmetric quantum mechanics [82] links the 
Witten index to the space of zero modes of a Dirac 
operator, in particular cases it can be used to 
describe topological aspects of sigma models and 
gauge theories, and related mathematical quantities 
such as the Atiyah-Singer index. 

More details and references to the original 
literature can be found in the reviews listed in the 
Further Reading section. 


See also: Path-Integrals in Non Commutative Geometry;
Supermanifolds.


Introduction


A prominent theme of modern condensed matter 
physics is electronic transport — in particular, the 
electrical conductivity — of disordered metallic 
systems at very low temperatures. From the Landau 
theory of weakly interacting Fermi liquids, one 
expects the essential aspects of the situation to be 
captured by the single-electron approximation. 
Mathematical models that have been proposed and 
studied in this context include random Schrödinger
operators and band random matrices. 

If the physical system has infinite size, two distinct 
possibilities exist: the quantum  single-electron 
motion may either be bounded or unbounded. In the 
former case, the disordered electron system is an 
insulator, in the latter case, a metal with finite 
conductivity (if the electron motion is not critical 
but diffusive). Metallic behavior is expected for 
weakly disordered systems in three dimensions; 


insulating behavior sets in when the disorder strength 
is increased or the space dimension reduced. 

The main theoretical tool used in the physics 
literature on the subject is the “supersymmetry 
method” pioneered by Wegner and Efetov (1979-83). 
Over the past 20 years, physicists have applied the 
method in many instances, and a rather complete 
picture of weakly disordered metals has emerged. 
Several excellent reviews of these developments are 
available in print. 

From the perspective of mathematics, however, the 
method has not always been described correctly, and 
what is sorely lacking at present is an exposition of 
how to implement the method rigorously. (Unfortunately,
the correct exposition by Schäfer and Wegner
(1980) was largely ignored or forgotten by later
authors.) In this article, an attempt is made to help 
remedy the situation, by giving a careful review of 
the Wegner-Efetov supersymmetry method for the 
case of Hermitian band random matrices. 


Gaussian Ensembles 


Let V be a unitary vector space of finite dimension.
A Hermitian random matrix model on V is defined
by some probability distribution on Herm(V), the
Hermitian linear operators on V. We may fix some
orthonormal basis of V and represent the elements
H of Herm(V) by Hermitian square matrices.

Quite generally, probability distributions are
characterized by their Fourier transform or characteristic
function. In the present case this is

$\Omega(K) = \langle e^{-i\,\mathrm{tr}\,HK} \rangle$


where the Fourier variable K is some other linear
operator on V, and $\langle \ldots \rangle$ denotes the expectation
value with respect to the probability distribution for
H. Later, it will be important that, if $\Omega(K)$ is an
analytic function of K, the matrix entries of K need
not be from $\mathbb{R}$ or $\mathbb{C}$ but can be taken from the even
part of some exterior algebra.

The probability distributions to be considered in
this article are Gaussian with zero mean, $\langle H \rangle = 0$.
Their Fourier transform is also Gaussian:

$\Omega(K) = e^{-(1/2) J(K, K)}$


with J some quadratic form. We now describe J for a
large family of hierarchical models that includes the
case of band random matrices.

Let V be given a decomposition by orthogonal
vector spaces:

$V = V_1 \oplus V_2 \oplus \cdots \oplus V_{|\Lambda|}$

We should imagine that every vector space $V_i$
corresponds to one site i of some lattice $\Lambda$, and the
total number of sites is $|\Lambda|$. For simplicity, we take
all dimensions to be equal: $\dim V_1 = \cdots = \dim V_{|\Lambda|} = N$.
Thus, the dimension of V is $N|\Lambda|$. The
integer N is called the number of orbitals per site.
If $\Pi_i$ is the orthogonal projector on the linear
subspace $V_i \subset V$, we take the bilinear form J to be

$J(K, K') = \sum_{i,j=1}^{|\Lambda|} J_{ij}\,\mathrm{tr}(\Pi_i K\,\Pi_j K')$


where the coefficients $J_{ij}$ are real, symmetric, and
positive. This choice of J implies invariance under
the group $\mathcal{U}$ of unitary transformations in each
subspace:

$\mathcal{U} = U(V_1) \times U(V_2) \times \cdots \times U(V_{|\Lambda|})$

Clearly, $\Omega(K) = \Omega(UKU^{-1})$ or, equivalently, the
probability distribution for H is invariant under
conjugation $H \mapsto UHU^{-1}$, for $U \in \mathcal{U}$.

By evaluating $J(E^{ab}_{ij}, E^{cd}_{kl}) = J_{ij}\,\delta_{il}\delta_{jk}\,\delta^{ad}\delta^{bc}$, one sees
that the matrix entries of H all are statistically
independent.

By varying the lattice $\Lambda$, the number of orbitals N,
and the variances $J_{ij}$, one obtains a large class of
Hermitian random matrix models, two prominent
subclasses of which are the following:


1. For $|\Lambda| = 1$, one gets the Gaussian Unitary
Ensemble (GUE). Its symmetry group is $\mathcal{U} = U(N)$,
the largest one possible in dimension
$N = \dim V$.

2. If $|i - j|$ denotes a distance function for $\Lambda$, and f a
rapidly decreasing positive function on $\mathbb{R}_+$ of
width W, the choice $J_{ij} = f(|i - j|)$ with $N = 1$
gives an ensemble of band random matrices with
bandwidth W and symmetry group $\mathcal{U} = U(1)^{|\Lambda|}$.


Beyond being real, symmetric, and positive, the
variances $J_{ij}$ are required to have two extra properties
in order for all of the following treatment to go
through:


• They must be positive as a quadratic form. This is
to guarantee the existence of an inverse, which we
denote by $w_{ij} = (J^{-1})_{ij}$.
• The off-diagonal matrix entries of the inverse
must be nonpositive: $w_{ij} \le 0$ for $i \ne j$.
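
The two requirements on the variances can be made concrete with a small example. The Python sketch below is illustrative only and rests on assumptions not made in the text: the lattice is a one-dimensional open chain with at least two sites, $N = 1$, and the profile is $f(r) = e^{-r/W}$, chosen because the inverse of the matrix $\rho^{|i-j|}$ (with $\rho = e^{-1/W}$) is known in closed form and is tridiagonal with negative off-diagonal entries. The function names are hypothetical.

```python
import math
import random


def band_variances(size, rho):
    """J_ij = rho**|i-j|, i.e. f(r) = exp(-r / W) with rho = exp(-1 / W)."""
    return [[rho ** abs(i - j) for j in range(size)] for i in range(size)]


def inverse_variances(size, rho):
    """Closed-form inverse w = J^{-1} of rho**|i-j| (tridiagonal), size >= 2."""
    c = 1.0 / (1.0 - rho * rho)
    w = [[0.0] * size for _ in range(size)]
    for i in range(size):
        w[i][i] = c * (1.0 if i in (0, size - 1) else 1.0 + rho * rho)
        if i + 1 < size:
            w[i][i + 1] = w[i + 1][i] = -c * rho  # nonpositive off-diagonal
    return w


def sample_band_matrix(J, rng):
    """Draw one Hermitian H with independent entries and <|H_ij|^2> = J_ij."""
    n = len(J)
    H = [[0j] * n for _ in range(n)]
    for i in range(n):
        H[i][i] = complex(rng.gauss(0.0, math.sqrt(J[i][i])), 0.0)
        for j in range(i + 1, n):
            sigma = math.sqrt(J[i][j] / 2.0)  # real and imaginary parts
            H[i][j] = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            H[j][i] = H[i][j].conjugate()
    return H
```

A typical use would be `J = band_variances(8, math.exp(-1 / 2.0))` followed by `sample_band_matrix(J, random.Random(0))`; the bandwidth W then controls how fast the typical size of $H_{ij}$ decays away from the diagonal.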


Basic Tools 
Green's Functions 


A major goal of random matrix theory is to
understand the statistical behavior of the spectrum
and the eigenstates of a random Hamiltonian H.
Spectral and eigenstate information can be extracted
from the Green's function, that is, from matrix
elements of the operator $(z - H)^{-1}$ with complex
parameter $z \in \mathbb{C} \setminus \mathbb{R}$. For the models at hand, the
good objects to consider are averages of $\mathcal{U}$-invariant
observables such as

$G^{(1)}_i(z) = \langle \mathrm{tr}\,\Pi_i (z - H)^{-1} \rangle$ [1]

$G^{(2)}_{ij}(z_1, z_2) = \langle \mathrm{tr}\,\Pi_i (z_1 - H)^{-1}\,\Pi_j (z_2 - H)^{-1} \rangle$ [2]

The discontinuity of $G^{(1)}_i(z)$ across the real z-axis
yields the local density of states. In the limit of
infinite volume ($|\Lambda| \to \infty$), the function $G^{(2)}_{ij}(z_1, z_2)$
for $z_1 = E + i\varepsilon$, $z_2 = E - i\varepsilon$, real energy E, and $\varepsilon > 0$
going to zero, gives information on transport,
for example, the electrical conductivity by the
Kubo-Greenwood formula.

Mathematically speaking, if $G^{(2)}_{ij}(E + i\varepsilon, E - i\varepsilon)$ is
bounded (for infinite volume) in $\varepsilon$ and decreases
algebraically with distance $|i - j|$ at $\varepsilon = 0+$, the
spectrum is absolutely continuous and the eigenstates
are extended at energy E. On the other hand,
a pure point spectrum and localized eigenstates are
signaled by the behavior $G^{(2)}_{ij} \sim \varepsilon^{-1} e^{-\lambda|i - j|}$ with
positive Lyapunov exponent $\lambda$.


Green's Functions from Determinants 


For any pair of linear operators A, B on a finite-dimensional
vector space V, the following formula
from basic linear algebra holds if A has an inverse:

$\frac{d}{dt}\det(A + tB)\Big|_{t=0} = \det(A)\,\mathrm{tr}(A^{-1}B)$

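Using it with $A = z - H$ and $z \in \mathbb{C} \setminus \mathbb{R}$, all Green's
functions can be expressed in terms of determinants;
for example,

$G^{(2)}_{ij}(w, z) = \sum_{a,b=1}^{N} \frac{\partial^2}{\partial s\,\partial t} \left\langle \frac{\det(w - H)\,\det(z - H + t E^{ab}_{ij})}{\det(w - H - s E^{ba}_{ji})\,\det(z - H)} \right\rangle \Big|_{s=t=0}$

It is clear that, given a formula of this kind, what
one wants is a method to handle ensemble averages
of ratios of determinants. This is what is reviewed in
the sequel.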
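
The linear-algebra identity used in this subsection is easy to test on a small case. The Python sketch below (an illustration with hypothetical helper names, restricted to $2 \times 2$ matrices for brevity) compares a central finite difference of $\det(A + tB)$ with $\det(A)\,\mathrm{tr}(A^{-1}B)$ for $A = z - H$ and B a matrix unit, so that the trace picks out one matrix element of the Green's function.

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def trace_of_product(a, b):
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))


def shifted(a, b, t):
    """The matrix A + t B."""
    return [[a[i][j] + t * b[i][j] for j in range(2)] for i in range(2)]


def numerical_derivative(a, b, h=1e-6):
    """Central difference for d/dt det(A + tB) at t = 0."""
    return (det2(shifted(a, b, h)) - det2(shifted(a, b, -h))) / (2 * h)


def analytic_derivative(a, b):
    """Right-hand side det(A) tr(A^{-1} B) of the identity."""
    return det2(a) * trace_of_product(inv2(a), b)


# Example data (arbitrary): a Hermitian H, a complex z off the real axis,
# and the matrix unit B with a single nonzero entry in position (0, 1).
z = 0.3 + 0.5j
H = [[1.0, 2.0 - 1.0j], [2.0 + 1.0j, -0.5]]
A = [[z - H[0][0], -H[0][1]], [-H[1][0], z - H[1][1]]]
B = [[0.0, 1.0], [0.0, 0.0]]
```

With these inputs, `analytic_derivative(A, B) / det2(A)` equals the entry in position (1, 0) of $(z - H)^{-1}$, which is the mechanism by which derivatives of determinant ratios generate Green's functions.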


Determinants as Gaussian Integrals 


Let the Hermitian scalar product of the unitary
vector space V be written as $\varphi_1, \varphi_2 \mapsto (\varphi_1, \varphi_2)$,
and denote the adjoint or Hermitian conjugate
of a linear operator A on V by $A^*$. If
$\mathrm{Re}\,A := (1/2)(A + A^*) > 0$, the standard Lebesgue
integral of the Gaussian function $\varphi \mapsto e^{-(\varphi, A\varphi)}$
makes sense and gives

$\int e^{-(\varphi, A\varphi)} = \det{}^{-1} A$ [3]

where it is understood that we are integrating with
the Lebesgue measure on (the normed vector space)
V normalized by $\int e^{-(\varphi, \varphi)} = 1$. The same integral
with anticommuting $\psi$ instead of the (commuting)
$\varphi \in V$ gives

$\int e^{-(\bar\psi, A\psi)} = \det A$ [4]

This basic formula from the field theory of
fermionic particles is a consequence of the integration
over anticommuting variables actually being
differentiation:

$\int f(\bar\psi_1, \psi_1, \ldots) := \frac{\partial^2}{\partial\bar\psi_1\,\partial\psi_1} \cdots f(\bar\psi_1, \psi_1, \ldots)$
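
Formula [4] can be verified by brute force for small matrices. The Python sketch below is an illustration under stated assumptions: Grassmann-algebra elements are stored as dictionaries from bitmasks to coefficients, the generators are ordered as $\bar\psi_1, \psi_1, \bar\psi_2, \psi_2, \ldots$, and the integral is taken as the iterated derivative described above, which amounts to $(-1)^n$ times the coefficient of the top monomial. All function names are hypothetical.

```python
def gmul(x, y):
    """Product of two Grassmann elements stored as {bitmask: coefficient}.

    Bit k set means generator k occurs; generators appear in increasing order.
    """
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            if a & b:  # a repeated generator squares to zero
                continue
            swaps, rest = 0, b
            while rest:  # move each generator of b left into sorted position
                low = rest & -rest
                j = low.bit_length() - 1
                swaps += bin(a >> (j + 1)).count("1")
                rest ^= low
            key = a | b
            out[key] = out.get(key, 0.0) + (-1) ** swaps * ca * cb
    return out


def gexp(s, n_pairs):
    """exp(s) for an even element s of degree 2; the series ends at s**n_pairs."""
    result, term = {0: 1.0}, {0: 1.0}
    for k in range(1, n_pairs + 1):
        term = {m: c / k for m, c in gmul(term, s).items()}
        for m, c in term.items():
            result[m] = result.get(m, 0.0) + c
    return result


def fermionic_gaussian(A):
    """Berezin integral of exp(-(psibar, A psi)) with the convention above."""
    n = len(A)
    action = {}
    for i in range(n):
        for j in range(n):
            # psibar_i is generator 2i, psi_j is generator 2j + 1
            pair = gmul({1 << (2 * i): 1.0}, {1 << (2 * j + 1): 1.0})
            for m, c in pair.items():
                action[m] = action.get(m, 0.0) - A[i][j] * c
    top = (1 << (2 * n)) - 1
    return (-1) ** n * gexp(action, n).get(top, 0.0)


def det(A):
    """Laplace expansion along the first row, for comparison."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))
```

Calling `fermionic_gaussian(A)` and `det(A)` on the same small matrix should give the same number, in contrast with the bosonic integral [3], which produces the inverse determinant.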


Fermionic Variant 


The supersymmetry method of random matrix
theory is a theme with many variations. The first
variation to be described is the “fermionic” one. To
optimize the notation, we now write $d\mu_{N,J}(H)$ for
the density of the Gaussian probability distribution
of H:

$\langle F(H) \rangle = \int F(H)\, d\mu_{N,J}(H)$

All determinants and traces appearing below will be
taken over vector spaces that are clear from the
context.

Let $z_1, \ldots, z_n$ be any set of n complex numbers,
put $z := \mathrm{diag}(z_1, \ldots, z_n)$ for later purposes, and
consider

$\Omega^{\rm fer}_{n,N}(z, J) = \int \prod_{\alpha=1}^{n} \det(z_\alpha - H)\, d\mu_{N,J}(H)$ [5]


The supersymmetry method expresses this average
of a product of determinants in an alternative way,
by integrating over a “dual” measure as follows.
Introducing an auxiliary unitary vector space
$\mathbb{C}^n$, one associates with every site i of the lattice
$\Lambda$ an object $Q_i \in \mathrm{Herm}(\mathbb{C}^n)$, the space of Hermitian
$n \times n$ matrices. If $dQ_i$ for $i = 1, \ldots, |\Lambda|$ are Lebesgue
measures on Herm($\mathbb{C}^n$), one puts $DQ = \mathrm{const.} \times \prod_i dQ_i$ and

$d\nu_{n,J}(Q) = e^{-(1/2)\sum_{i,j} w_{ij}\,\mathrm{tr}\,Q_i Q_j}\, DQ$ [6]

The multiplicative constant in DQ is fixed by
requiring the density to be normalized: $\int d\nu_{n,J}(Q) = 1$.
By completing the square, one finds that this Gaussian
probability measure has the characteristic function

$\int e^{-i\sum_j \mathrm{tr}\,K_j Q_j}\, d\nu_{n,J}(Q) = e^{-(1/2)\sum_{i,j} J_{ij}\,\mathrm{tr}\,K_i K_j}$

where the Fourier variables $K_1, \ldots, K_{|\Lambda|}$ are $n \times n$
matrices with matrix entries taken from $\mathbb{C}$ or
another commutative algebra.

The key relation of the fermionic variant of the
supersymmetry method is that the expectation of the
product of determinants [5] has another expression as

$\Omega^{\rm fer}_{n,N}(z, J) = \int \prod_{i=1}^{|\Lambda|} \det{}^{N}(z - iQ_i)\, d\nu_{n,J}(Q)$ [7]

($i = \sqrt{-1}$). The strategy of the proof is quite simple:
one writes the determinants in both expressions for
$\Omega^{\rm fer}_{n,N}$ as Gaussian integrals over $nN|\Lambda|$ complex
fermionic variables $\psi_1, \ldots, \psi_n$ (each $\psi_\alpha$ is a vector in
V with anticommuting coefficients), using the basic
formula [4]. The integrals then encountered are
essentially the Fourier transforms of the distributions
$d\mu_{N,J}(H)$ resp. $d\nu_{n,J}(Q)$. The result is

$\int e^{-\sum_\alpha z_\alpha (\bar\psi_\alpha, \psi_\alpha) - (1/2)\sum_{i,j} J_{ij} \sum_{\alpha,\beta} (\bar\psi_\alpha, \Pi_i\psi_\beta)(\bar\psi_\beta, \Pi_j\psi_\alpha)}$

for both expressions of $\Omega^{\rm fer}_{n,N}$. In other words,
although the probability distributions $d\mu_{N,J}(H)$ and
$d\nu_{n,J}(Q)$ are distinct (they are defined on different
spaces), their characteristic functions coincide when
evaluated on the Fourier variables $K = \sum_\alpha \psi_\alpha (\bar\psi_\alpha, \bullet)$
for H and $(K_i)_{\alpha\beta} = (\bar\psi_\alpha, \Pi_i\psi_\beta)$ for $Q_i$. This establishes
the claimed equality of the expressions [5] and [7] for
$\Omega^{\rm fer}_{n,N}(z, J)$.

What is the advantage of passing to the alternative
expression by $d\nu_{n,J}(Q)$? The answer is that, while H
is made up of independent random variables, the new
variables $Q_i$, called the Hubbard-Stratonovich field,
are correlated: they interact through the “exchange”
constants $w_{ij} = (J^{-1})_{ij}$. If that interaction creates
enough collectivity, a kind of mean-field behavior
results.

For the simple case of GUE ($|\Lambda| = 1$, $w_{11} = N/\lambda^2$)
with $z_1 = \cdots = z_n = E$, one gets the relation

$\langle \det{}^{n}(E - H) \rangle = \int \det{}^{N}(E - iQ)\, e^{-(N/2\lambda^2)\,\mathrm{tr}\,Q^2}\, dQ$

the right-hand side of which is easily analyzed by the
steepest descent method in the limit of large N.
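
For $n = 1$ the GUE relation can be checked directly, since Q is then a single real variable. The Python sketch below is an illustration with hypothetical function names. The left-hand side is evaluated exactly by expanding the determinant over permutations and applying Wick's theorem with $\langle |H_{ij}|^2 \rangle = \lambda^2/N$: fixed points contribute E, each 2-cycle contributes $-\lambda^2/N$, and longer cycles average to zero. The right-hand side is a normalized one-dimensional Gaussian integral, computed here by the trapezoidal rule.

```python
import math
from itertools import permutations


def lhs_average_det(N, E, lam):
    """Exact <det(E - H)> for N x N GUE with <|H_ij|^2> = lam**2 / N."""
    J = lam ** 2 / N
    total = 0.0
    for perm in permutations(range(N)):
        seen = [False] * N
        term = 1.0
        for start in range(N):
            if seen[start]:
                continue
            length, k = 0, start
            while not seen[k]:  # walk around one cycle of the permutation
                seen[k] = True
                k = perm[k]
                length += 1
            if length == 1:
                term *= E       # diagonal factor E - H_ii averages to E
            elif length == 2:
                term *= -J      # sign of a transposition times <H_ij H_ji>
            else:
                term = 0.0      # no Wick pairing inside longer cycles
        total += term
    return total


def rhs_dual_integral(N, E, lam, points=4001, cutoff=12.0):
    """Normalized integral of (E - i q)^N exp(-N q^2 / (2 lam^2)) over real q."""
    J = lam ** 2 / N
    sigma = math.sqrt(J)
    h = 2 * cutoff * sigma / (points - 1)
    total = 0j
    for k in range(points):
        q = -cutoff * sigma + k * h
        weight = 0.5 if k in (0, points - 1) else 1.0
        total += weight * (E - 1j * q) ** N * math.exp(-q * q / (2 * J))
    return total * h / math.sqrt(2 * math.pi * J)
```

Both functions return the same polynomial in E (for $N = 3$ it is $E^3 - \lambda^2 E$); the imaginary part of the dual integral cancels by symmetry. The example trades an average over an $N \times N$ matrix for an integral over a single variable in which N appears only as an exponent, which is what makes the large-N steepest descent analysis possible.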

For band random matrices in the so-called ergodic
regime, the physical behavior turns out to be governed
by the constant mode $Q_1 = \cdots = Q_{|\Lambda|}$, a fact that can
be used to establish GUE universality in that regime.


Bosonic Variant 


The bosonic variant of the present method, due to
Wegner, computes averages of products of determinants
placed in the denominator:

$\Omega^{\rm bos}_{n,N}(z, J) = \int \prod_{\alpha=1}^{n} \det{}^{-1}(z_\alpha - H)\, d\mu_{N,J}(H)$ [8]

where we now require $\mathrm{Im}\,z_\alpha \ne 0$ for all $\alpha = 1, \ldots, n$.
Complications relative to the fermionic case arise
from the fact that the integrand in [8] has poles. If
one replaces the anticommuting vectors $\psi_\alpha$ by
commuting ones $\varphi_\alpha$, and then simply repeats the
previous calculation in a naive manner, one arrives at

$\Omega^{\rm bos}_{n,N}(z, J) = \int \prod_{i=1}^{|\Lambda|} \det{}^{-N}(z - Q_i)\, d\nu_{n,J}(Q)$ [9]

where the integral is still over $Q_i \in \mathrm{Herm}(\mathbb{C}^n)$. The
calculation is correct, and relation [9] therefore
holds true, provided that the parameters $z_1, \ldots, z_n$
all lie in the same half (upper or lower) of the
complex plane. To obtain information on transport
properties, however, one needs parameters in both
the upper and lower halves; see the paragraph
following [2]. The general case to be addressed
below is $\mathrm{Im}\,z_\alpha > 0$ for $\alpha = 1, \ldots, p$, and $\mathrm{Im}\,z_\alpha < 0$
for $\alpha = p + 1, \ldots, n$. Careful inspection of the steps
leading to eqn [9] reveals a convergence problem for
$0 < p < n$. In fact, [9] with $Q_i$ in Herm($\mathbb{C}^n$) turns
out to be false in that range. Learning how to
resolve this problem is the main step toward
mathematical mastery of the method. Let us therefore
give the details.

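If $s_\alpha := \mathrm{sgn}\,\mathrm{Im}\,z_\alpha$, the good (meaning convergent)
Gaussian integral to consider is

$\det{}^{-1}\big(-i s_\alpha (z_\alpha - H)\big) = \int e^{i s_\alpha (\varphi_\alpha, (z_\alpha - H)\varphi_\alpha)}$

To avoid carrying around trivial constants, we now
assume $i^{(n-2p)N|\Lambda|} = 1$. Use of the characteristic
function of the distribution for H then gives

$\Omega^{\rm bos}_{n,N}(z, J) = \int e^{i\sum_\alpha s_\alpha z_\alpha (\varphi_\alpha, \varphi_\alpha)} \times e^{-(1/2)\sum_{i,j} J_{ij} \sum_{\alpha,\beta} s_\alpha s_\beta (\varphi_\alpha, \Pi_i\varphi_\beta)(\varphi_\beta, \Pi_j\varphi_\alpha)}$ [10]

The difficulty of analyzing this expression stems
from the “hyperbolic” nature (due to the indefiniteness
of the signs $s_\alpha = \pm 1$) of the term quartic in the
$\varphi_\alpha$, $\bar\varphi_\alpha$.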


Fyodorov's Method 


The integrand for $\Omega^{\rm bos}_{n,N}$ is naturally expressed in
terms of $n \times n$ matrices $M_i$ with matrix elements
$(M_i)_{\alpha\beta} = (\varphi_\alpha, \Pi_i\varphi_\beta)$. These matrices lie in
$\mathrm{Herm}^+(\mathbb{C}^n)$, that is, they are non-negative as well
as Hermitian. Fyodorov's idea was to introduce
them as the new variables of integration. To do
that step, recall the basic fact that, given two
differentiable spaces X and Y and a smooth map
$\psi: X \to Y$, a distribution $\mu$ on X is pushed forward
to a distribution $\psi(\mu)$ on Y by $\psi(\mu)[f] := \mu[f \circ \psi]$,
where f is any test function on Y.

We apply this universal principle to the case at
hand by identifying X with $V^n$, and Y with
$(\mathrm{Herm}^+(\mathbb{C}^n))^{|\Lambda|}$, and $\psi$ with the mapping that sends

$(\varphi_1, \ldots, \varphi_n) \in X$ to $(M_1, \ldots, M_{|\Lambda|}) \in Y$

by $(M_i)_{\alpha\beta} = (\varphi_\alpha, \Pi_i\varphi_\beta)$. On $X = V^n$ we are integrating
with the product Lebesgue measure normalized
by $\int e^{-\sum_\alpha (\varphi_\alpha, \varphi_\alpha)} = 1$. We now want the push-forward
of this flat measure (or distribution) by the mapping
$\psi$. In general, the push-forward of a measure is not
guaranteed to have a density but may be singular
(like a Dirac $\delta$-distribution). This is in fact what
happens if $N < n$. The matrices $M_i$ then have less
than the maximal rank, so they fail to be positive
but possess zero eigenvalues, which implies that
the flat measure on X is pushed forward by $\psi$ into the
boundary of Y. For $N \ge n$, on the other hand, the
push-forward measure does have a density on Y; and
that density is $\prod_{i=1}^{|\Lambda|} (\det M_i)^{N-n}\, dM_i$, as is seen by
transforming to the eigenvalue representation and
comparing Jacobians. The $dM_i$ are Lebesgue measures
on Herm($\mathbb{C}^n$), normalized by the condition

$\int_{M_i > 0} e^{-\mathrm{tr}\,M_i} (\det M_i)^{N-n}\, dM_i = \int e^{-\sum_\alpha (\varphi_\alpha, \Pi_i\varphi_\alpha)} = 1$
M;>0 " 


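Assembling the sign information for $\mathrm{Im}\,z_\alpha$ in a
diagonal matrix $s := \mathrm{diag}(s_1, \ldots, s_n)$, and pushing the
integral over X forward to an integral over Y with
measure $DM := \prod_i dM_i$, we obtain Fyodorov's
formula:

$\Omega^{\rm bos}_{n,N}(z, J) = \int_Y e^{-(1/2)\sum_{i,j} J_{ij}\,\mathrm{tr}(sM_i sM_j)} \times e^{\sum_k \mathrm{tr}(i s z M_k + (N-n)\ln M_k)}\, DM$ [11]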


This formula has a number of attractive features.
One is ease of derivation, another is ready generalizability
to the case of non-Gaussian distributions.
The main disadvantage of the formula is that it does
not apply to the case of band random matrices
(because of the restriction $N \ge n$); nor does it
combine nicely with the fermionic formula [7] to
give a supersymmetric formalism, as one formula is
built on $J_{ij}$ and the other on $w_{ij}$.

Note that [11] clearly displays the dependence on
the signature of $\mathrm{Im}\,z$: you cannot remove the $s_1, \ldots, s_n$
from the integrand without changing the domain of
integration $Y = (\mathrm{Herm}^+(\mathbb{C}^n))^{|\Lambda|}$. This important
feature is missing from the naive formula [9].

Setting $q = n - p$, let U(p, q) be the pseudounitary
group of complex $n \times n$ matrices T with inverse
$T^{-1} = sT^*s$. Since $|\det T| = 1$ for $T \in U(p, q)$, the
integration domain Y and density $DM = \prod_i dM_i$ of
Fyodorov's formula are invariant under U(p, q)
transformations $M_i \mapsto TM_iT^*$, and so is actually the
integrand in the limit where all parameters $z_1, \ldots, z_n$
become equal. Thus, the elements of U(p, q) are
global symmetries in that limit. This observation
holds the key to another method of transforming the
expression [10].


The Method of Scháfer and Wegner 


To rescue the naive formula [9], what needs to be
abandoned is the integration domain Herm($\mathbb{C}^n$) for
the matrices $Q_i$. The good domain to use was
constructed by Schäfer and Wegner, but was largely
forgotten in later physics work.

Writing $(M_k)_{\alpha\beta} = (\varphi_\alpha, \Pi_k\varphi_\beta)$ as before, consider
the function

$F_M(Q) = e^{(1/2)\sum_{i,j} w_{ij}\,\mathrm{tr}(sQ_i + iz)(sQ_j + iz) - \sum_k \mathrm{tr}\,M_k Q_k}$ [12]

viewed as a holomorphic function of

$Q = (Q_1, \ldots, Q_{|\Lambda|}) \in \mathrm{End}(\mathbb{C}^n)^{|\Lambda|}$


If the Gaussian integral $\int F_M(Q)\,DQ$ with holomorphic
density $DQ = \prod_i dQ_i$ is formally carried
out by completing the square, one gets the integrand
of [10]. This is just what we want, as it would allow
us to pass to a Q-matrix formulation akin to the one
of the previous section. But how can that formal
step be made rigorous? To that end, one needs to (1)
construct a domain on which $|F_M(Q)|$ decreases
rapidly so that the integral exists, and (2) justify
completion of the square and shifting of variables.
To begin, take the absolute value of $F_M(Q)$.
Putting $(1/2)(Q_i + Q_i^*) =: \mathrm{Re}\,Q_i$ and $(1/2i)(Q_i - Q_i^*) =: \mathrm{Im}\,Q_i$,
we have $|F_M| = e^{-(1/4)(f_1 + f_2 + f_3)}$ with

$f_1(Q) = \sum_{i,j} w_{ij}\,\mathrm{tr}(s\,\mathrm{Im}\,Q_i + z)(s\,\mathrm{Im}\,Q_j + z) + \mathrm{c.c.}$

$f_2(Q) = -2\sum_{i,j} w_{ij}\,\mathrm{tr}(s\,\mathrm{Re}\,Q_i)(s\,\mathrm{Re}\,Q_j)$

$f_3(Q) = 4\sum_i \mathrm{tr}\Big(M_i + s\,\mathrm{Im}\,z \sum_j w_{ij}\Big)\,\mathrm{Re}\,Q_i$

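These expressions suggest making the following
choice of integration domain for $Q_i$ ($i = 1, \ldots, |\Lambda|$).
Pick some real constant $\lambda > 0$ and put

$\mathrm{Re}\,Q_i = \lambda\,T_i T_i^*, \qquad \mathrm{Im}\,Q_i = P_i = \begin{pmatrix} P_i^+ & 0 \\ 0 & P_i^- \end{pmatrix}$

with $T_i \in U(p, q)$, $P_i^+ \in \mathrm{Herm}(\mathbb{C}^p)$, $P_i^- \in \mathrm{Herm}(\mathbb{C}^q)$.
The set of matrices $Q_i$ so defined is referred to as
the Schäfer-Wegner domain $X^\lambda_{p,q}$. The range of the
field $Q = (Q_1, \ldots, Q_{|\Lambda|})$ is the direct product
$\mathcal{X} := (X^\lambda_{p,q})^{|\Lambda|}$.
To show that this is a good choice of domain, we
first of all show convergence of the integral
$\int_{\mathcal{X}} F_M(Q)\,DQ$. The matrices $P_i$ commute with s, so

$f_1(Q)\big|_{\mathcal{X}} = 2\,\mathrm{Re} \sum_{i,j} w_{ij}\,\mathrm{tr}(P_i + sz)(P_j + sz)$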


Since the coefficients $w_{ij}$ are positive as a quadratic
form, this expression is convex (with a positive
Hessian) in the Hermitian matrices $P_i$. Second, the
function

$f_2(Q)\big|_{\mathcal{X}} = -2\lambda^2 \sum_{i,j} w_{ij}\,\mathrm{tr}\,(T_i T_i^*)^{-1}\,T_j T_j^*$

is bounded from below by the constant $-2\lambda^2 n \sum_i w_{ii}$.
This holds true because $w_{ij}$ is negative for $i \ne j$, and
because $T_i T_i^* > 0$ and the trace of a product of two
positive Hermitian matrices is always positive.

Third,

$f_3(Q)\big|_{\mathcal{X}} = 4\lambda \sum_i \mathrm{tr}\Big(M_i + s\,\mathrm{Im}\,z \sum_j w_{ij}\Big)\,T_i T_i^*$

is positive, as (...) is positive Hermitian. As long as
$s\,\mathrm{Im}\,z > 0$, the function $f_3$ goes to infinity for all
possible directions of taking the $T_i$ to infinity on
U(p, q).

Thus, when the matrices $Q_i$ are taken to vary on
the Schäfer-Wegner domain $X^\lambda_{p,q}$, the absolute value
$|F_M| = e^{-(1/4)(f_1 + f_2 + f_3)}$ decreases rapidly at infinity.
This establishes the convergence of $\int_{\mathcal{X}} F_M(Q)\,DQ$.

Next, let us count dimensions. The mapping
$T \mapsto TT^*$ for $T \in U(p, q) =: G$ is invariant under
right multiplication of T by elements of the unitary
subgroup $H := U(p) \times U(q)$; it is called the “Cartan
embedding” of G/H into G. The real manifold G/H
has dimension 2pq and so does its image under the
Cartan embedding. Augmenting this by the dimension
of Herm($\mathbb{C}^p$) and Herm($\mathbb{C}^q$) (from $P_i$), one gets
$\dim X^\lambda_{p,q} = 2pq + p^2 + q^2 = (p + q)^2 = n^2$, which is as
it should be.
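
The ingredients of this dimension count can be made explicit for $p = q = 1$. The Python sketch below is illustrative and uses a particular parametrization of U(1, 1) that is an assumption of the example (a hyperbolic rotation with parameters $\theta$, $\phi$, multiplied on the right by a diagonal phase matrix from $U(1) \times U(1)$); the function names are hypothetical. It checks the defining relation $T^{-1} = sT^*s$ and the invariance of $TT^*$ under right multiplication by the unitary subgroup.

```python
import cmath
import math

S = [[1, 0], [0, -1]]  # signature matrix s for p = q = 1


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def u11_element(theta, phi, alpha, beta):
    """Hyperbolic rotation times a U(1) x U(1) phase factor (assumed chart)."""
    c, s = math.cosh(theta / 2), math.sinh(theta / 2)
    boost = [[c, s * cmath.exp(1j * phi)], [s * cmath.exp(-1j * phi), c]]
    phases = [[cmath.exp(1j * alpha), 0], [0, cmath.exp(1j * beta)]]
    return matmul(boost, phases)


def pseudounitarity_defect(t):
    """Largest entry of |T (s T* s) - 1|; it vanishes when T^{-1} = s T* s."""
    prod = matmul(t, matmul(S, matmul(dagger(t), S)))
    return max(abs(prod[i][j] - (1 if i == j else 0))
               for i in range(2) for j in range(2))


def cartan_image(t):
    """The positive Hermitian matrix T T*, independent of the phase factor."""
    return matmul(t, dagger(t))
```

In this chart $TT^*$ depends only on the two parameters $(\theta, \phi)$, matching $2pq = 2$, and the two real diagonal entries of $P_i$ supply the remaining $p^2 + q^2 = 2$ dimensions, for a total of $n^2 = 4$.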

Finally, why can one shift variables and do the
Gaussian integral over Q (with translation-invariant
DQ) by completing the square? This question is
legitimate as the Schäfer-Wegner domain $X^\lambda_{p,q}$ lacks
invariance under the required shift, which is
$Q_i \to Q_i - isz + \sum_j J_{ij}\,sM_j s$.

To complete the square in [12], introduce a
parameter $t \in [0, 1]$ and consider the family of shifts

$Q_i \to Q_i + t\,\big(-isz + \sum_j J_{ij}\,sM_j s\big)$

For fixed t, this shift takes $\mathcal{X} = (X^\lambda_{p,q})^{|\Lambda|}$ into another
domain, $\mathcal{X}(t)$. Inspection shows that the function
[12] still decreases rapidly (uniformly in the $M_i$) on
$\mathcal{X}(t)$, as long as $t < 1$. Without changing the
integral, one can add pieces to $\mathcal{X}(t)$ (for $t < 1$) at
infinity to arrange for the chain $\mathcal{X} - \mathcal{X}(t)$ to be a
cycle. Because $\mathcal{X}(t)$ is homotopic to $\mathcal{X}(0) = \mathcal{X}$, this
cycle is a boundary: there exists a manifold $\mathcal{Y}(t)$ of
dimension $\dim \mathcal{X} + 1$ such that $\partial\mathcal{Y}(t) = \mathcal{X} - \mathcal{X}(t)$.
Viewed as a holomorphic differential form of degree
$(n^2|\Lambda|, 0)$ in the complex space $\mathrm{End}(\mathbb{C}^n)^{|\Lambda|}$, the
integrand $\omega := F_M(Q)\,DQ$ is closed (i.e., $d\omega = 0$).
Therefore, by Stokes' theorem,

$\int_{\mathcal{X}} \omega - \int_{\mathcal{X}(t)} \omega = \int_{\partial\mathcal{Y}(t)} \omega = \int_{\mathcal{Y}(t)} d\omega = 0$

which proves $\int_{\mathcal{X}(t)} F_M(Q)\,DQ = \int_{\mathcal{X}} F_M(Q)\,DQ$, independent
of t. (This argument does not go through
for the nonrigorous choice $sQ_i := T_i P_i T_i^{-1}$ usually
made!)

In the limit $t \to 1$, one encounters the expression

$\int_{\mathcal{X}} F_M(Q)\,DQ = \int_{\mathcal{X}} d\nu_{n,J}(isQ) \times e^{-(1/2)\sum_{i,j} J_{ij}\,\mathrm{tr}(sM_i sM_j) + i\sum_k \mathrm{tr}(szM_k)}$

with $d\nu_{n,J}$ as in [6]. The normalization integral over
$\mathcal{X}$ is defined by taking the Hermitian matrices $P_i$ to
be the inner variables of integration. The outer
integrals over the $T_i$ then demonstrably exist, and
one can fix the (otherwise arbitrary) normalization
of DQ by setting $\int_{\mathcal{X}} d\nu_{n,J}(isQ) = 1$. Making that
choice, and comparing with [10], one has proved

$\Omega^{\rm bos}_{n,N}(z, J) = \int_{V^n} \Big( \int_{\mathcal{X}} F_{M(\varphi)}(Q)\,DQ \Big)$

The final step is to change the order of integration
over the Q- and $\varphi$-variables, which is permitted
since the Q-integral converges uniformly in $\varphi$.
Doing the Gaussian $\varphi$-integral and shifting
$Q_k \to Q_k - isz$, one arrives at the Schäfer-Wegner formula
for $\Omega^{\rm bos}_{n,N}$:

$\Omega^{\rm bos}_{n,N}(z, w^{-1}) = \int_{\mathcal{X}} e^{(1/2)\sum_{i,j} w_{ij}\,\mathrm{tr}(sQ_i sQ_j)} \times e^{-N\sum_k \mathrm{tr}\ln(Q_k - isz)}\, DQ$ [13]

which is a rigorous version of the naive formula [9].
Compared to Fyodorov's formula, it has the disadvantage
of not being manifestly invariant under
global hyperbolic transformations $Q_i \mapsto TQ_iT^*$ (the
integration domain $\mathcal{X}$ is not invariant). Its best
feature is that it does apply to the case of band
random matrices with one orbital per site ($N = 1$).


Supersymmetric Variant 


We are now in a position to tackle the problem of
averaging ratios of determinants. For concreteness,
we shall discuss the case where the number of
determinants is two for both the numerator and the
denominator, which is what is needed for the
calculation of the function $G^{(2)}_{ij}(z_1, z_2)$ defined in
eqn [2]. We will consider the case of relevance for
the electrical conductivity: $z_1 = E + i\varepsilon$, $z_2 = E - i\varepsilon$,
with $E \in \mathbb{R}$ and $\varepsilon > 0$.

A Q-integral formula for $G^{(2)}_{ij}(z_1, z_2)$ can be
derived by combining the fermionic method for

$\langle \det(z_1 - H)\,\det(z_2 - H + t_2 E^{ab}_{ij}) \rangle$

with the Schäfer-Wegner bosonic formalism for

$\langle \det{}^{-1}(z_1 - H - t_1 E^{ba}_{ji})\,\det{}^{-1}(z_2 - H) \rangle$

and eventually differentiating with respect to $t_1$, $t_2$ at
$t_1 = t_2 = 0$ and summing over a, b; see the subsection
“Green's functions from determinants.” All steps are
formally the same as before, but with traces and
determinants replaced by their supersymmetric
analogs. Having given a great many technical details
in the last two sections, we now just present the
final formula along with the necessary definitions
and some indication of what are the new elements
involved in the proof.

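Let each of $Q_{BB}$, $Q_{FF}$, $Q_{BF}$, and $Q_{FB}$ stand for a
$2 \times 2$ matrix. If the first two matrices have
commuting entries and the last two anticommuting
ones, they combine to a $4 \times 4$ supermatrix:

$Q = \begin{pmatrix} Q_{BB} & Q_{BF} \\ Q_{FB} & Q_{FF} \end{pmatrix}$

Relevant operations on supermatrices are the
supertrace,

$\mathrm{Str}\,Q = \mathrm{tr}\,Q_{BB} - \mathrm{tr}\,Q_{FF}$

and the superdeterminant,

$\mathrm{Sdet}\,Q = \frac{\det(Q_{BB})}{\det(Q_{FF} - Q_{FB}\,Q_{BB}^{-1}\,Q_{BF})}$

These are related by the identity $\mathrm{Sdet} = \exp \circ\, \mathrm{Str} \circ \ln$
whenever the superdeterminant exists and is
nonzero.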

In the process of applying the method described
earlier, a supermatrix $Q_i$ gets introduced at every
site i of the lattice $\Lambda$. The domain of integration for
each of the matrix blocks $(Q_i)_{BB}$ ($i = 1, \ldots, |\Lambda|$) is
taken to be the Schäfer-Wegner domain $X^\lambda_{1,1}$ (with
some choice of $\lambda > 0$); the integration domain for
each of the $(Q_i)_{FF}$ is the space of Hermitian $2 \times 2$
matrices, as before.

Let $E^{11}_{BB}$ be the $4 \times 4$ (super)matrix with unit entry
in the upper-left corner and zeros elsewhere; similarly,
$E^{22}_{FF}$ has unity in the lower-right corner and
zeros elsewhere. Putting $s = \mathrm{diag}(1, -1, 1, 1)$ and
$z = \mathrm{diag}(z_1, z_2, z_1, z_2)$, the supersymmetric Q-integral
formula for the generating function of $G^{(2)}_{ij}$,
obtained by combining the Schäfer-Wegner bosonic
method with the fermionic variant, is written as

$\left\langle \frac{\det(z_1 - H)\,\det(z_2 - H + t_2 E^{ab}_{ij})}{\det(z_1 - H - t_1 E^{ba}_{ji})\,\det(z_2 - H)} \right\rangle = \int DQ\; e^{(1/2)\sum_{r,r'} w_{rr'}\,\mathrm{Str}(sQ_r sQ_{r'})} \times e^{-\mathrm{Str}\ln\left(\sum_r (Q_r - isz) \otimes \Pi_r + i t_1 E^{11}_{BB} \otimes E^{ba}_{ji} - i t_2 E^{22}_{FF} \otimes E^{ab}_{ij}\right)}$ [14]

where the second supertrace includes a sum over
sites and orbitals, and on setting $t_1 = t_2 = 0$ becomes

$e^{-N\sum_r \mathrm{Str}\ln(Q_r - isz)} = \prod_r \mathrm{Sdet}^{-N}(Q_r - isz)$
r 


The superintegral “measure” $DQ = \prod_r DQ_r$ is the
flat Berezin form, that is, the product of differentials
for all the commuting matrix entries in $(Q_r)_{BB}$ and
$(Q_r)_{FF}$, times the product of derivatives for all the
anticommuting matrix entries in $(Q_r)_{BF}$ and $(Q_r)_{FB}$.
To prove the formula [14], two new tools are
needed, a brief account of which is as follows.


Gaussian Superintegrals 


There exists a supersymmetric generalization of the
Gaussian integration formulas given in the subsection
“Determinants as Gaussian integrals”: if
A, D (B, C) are linear operators or matrices with
commuting (resp., anticommuting) entries, and
$\mathrm{Re}\,A > 0$, one has

$\mathrm{Sdet}^{-1}\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \int e^{-(\varphi, A\varphi) - (\varphi, B\psi) - (\bar\psi, C\varphi) - (\bar\psi, D\psi)}$

Verification of this formula is straightforward.
Using it, one writes the last factor in [14] as a
Gaussian superintegral over four vectors: $\varphi_1$, $\varphi_2$, $\psi_1$,
and $\psi_2$. The integrand then becomes Gaussian in the
matrices $Q_r$.


Shifting Variables 


The next step in the proof is to do the “Gaussian”
integral over the supermatrices $Q_r$. By definition, in
a superintegral, one first carries out the Fermi
integral, and afterwards the ordinary integrations.
The Gaussian integral over the anticommuting parts
$(Q_r)_{BF}$ and $(Q_r)_{FB}$ is readily done by completing the
square and shifting variables using the fact that
fermionic integration is differentiation:

$\int d\xi\, f(\xi - \xi') = \frac{\partial}{\partial\xi} f(\xi - \xi') = \int d\xi\, f(\xi)$

Similarly, the Gaussian integral over the Hermitian
matrices $(Q_r)_{FF}$ is done by completing the square
and shifting. The integral over $(Q_r)_{BB}$, however, is
not Gaussian, as the domain is not $\mathbb{R}^n$ but the
Schäfer-Wegner domain. Here, more advanced
calculus is required: these integrations are done by
using a supersymmetric change-of-variables theorem
due to Berezin to make the necessary shifts by
nilpotents. (There is not enough space to describe
this here, so please consult Berezin's (1987) book.)
Without difficulty, one finds the result to agree with
the left-hand side of eqn [14], thereby establishing
that formula.


Approximations


All manipulations so far have been exact and, in 
fact, rigorous (or can be made so with little extra 
effort). Now we turn to a sequence of approxima- 
tions that have been used by physicists to develop a 
quantitative understanding of weakly disordered 
quantum dots, wires, films, etc. While physically 
satisfactory, not all of these approximations are 
under full mathematical control. We will briefly 
comment on their validity as we go along. 


Saddle-Point Manifold 


We continue to consider $G^{(2)}_{ij}(E + i\varepsilon, E - i\varepsilon)$ and
focus on $E = 0$ (the center of the energy band) for
simplicity. By varying the exponent on the right-hand
side of [14] and setting the variation to zero
one obtains, for $t_1 = t_2 = 0$,

$\sum_j w_{ij}\,sQ_j s - N Q_i^{-1} = 0$

which is called the saddle-point equation.
Let us now assume translational invariance,
$w_{ij} = f(|i - j|)$. Then, if $\lambda = \sqrt{N / \sum_j w_{ij}}$, the saddle-point
equation has i-independent solutions of the
form

$Q_i = \lambda \begin{pmatrix} q_{BB} & 0 \\ 0 & q_{FF} \end{pmatrix}$

where for $q_{FF}$ there are three possibilities: two
isolated points $q_{FF} = \pm 1$ (unit matrix) coexist with
a manifold

$q_{FF} = \begin{pmatrix} \cos\theta_1 & \sin\theta_1\,e^{i\phi_1} \\ \sin\theta_1\,e^{-i\phi_1} & -\cos\theta_1 \end{pmatrix}$ [15]

which is two-dimensional; whereas the solution
space for $q_{BB}$ consists of a single connected
2-manifold:

$q_{BB} = \begin{pmatrix} \cosh\theta_0 & \sinh\theta_0\,e^{i\phi_0} \\ \sinh\theta_0\,e^{-i\phi_0} & \cosh\theta_0 \end{pmatrix}$ [16]

The solutions $q_{FF} = \pm 1$ are usually discarded in the
physics literature. (The argument is that they break
supersymmetry and therefore get suppressed by
fermionic zero modes. For the simpler case of the
one-point function [1] and in three space dimensions,
such suppression has recently been proved by
Disertori, Pinson, and Spencer.) Other solutions for
$q_{BB}$ are ruled out by the requirement $\mathrm{Re}\,Q_i > 0$ for
the Schäfer-Wegner domain.

The set of matrices [16] and [15], the “saddle-point
manifold”, is diffeomorphic to the product of
a 2-hyperboloid $H^2$ with a 2-sphere $S^2$. Moving
along that manifold $M := H^2 \times S^2$ leaves the Q-field
integrand [14] unchanged (for $z_1 = z_2 = t_1 = t_2 = 0$).

One can actually anticipate the existence of such a 
manifold from the symmetries at hand. These are 
most transparent in the starting point of the 
formalism as given by the characteristic function 
$\langle e^{-iK_H} \rangle$ with 

$$K_H = (\varphi_1, H\varphi_1) - (\varphi_2, H\varphi_2) + (\psi_1, H\psi_1) + (\psi_2, H\psi_2)$$

The signs of this quadratic expression are what is 
encoded in the signature matrix $s = \mathrm{diag}(1, -1, 1, 1)$ 
(recall that the first two entries are forced by $\mathrm{Im}\, z_1 > 0$ 
and $\mathrm{Im}\, z_2 < 0$). The Hermitian form $K_H$ is 
invariant under the product of two Lie groups: 
U(1, 1) acting on the $\varphi$'s, and U(2) acting on the $\psi$'s. 
This invariance gets transferred by the formalism to 
the Q-side; the saddle-point manifold M is in fact an 
"orbit" of the group action of $G := U(1, 1) \times U(2)$ 
on the Q-field. In the language of physics, the 
degrees of freedom of M correspond to the Goldstone 
bosons of a broken symmetry. 

$K_H$ also has some supersymmetries, mixing $\varphi$'s 
with $\psi$'s. At the infinitesimal level, these combine 
with the generators of G to give a Lie superalgebra 
of symmetries $\mathfrak{g} := \mathfrak{u}(1, 1|2)$. One therefore expects 
some kind of saddle-point supermanifold, say $\mathcal{M}$, on 
the Q-side. 

$\mathcal{M}$ can be constructed by extending the above 
solution $q_0 := \mathrm{diag}(q_{BB}, q_{FF})$ of the dimensionless 
saddle-point equation $s q s = q^{-1}$ to the full $4 \times 4$ 
supermatrix space. Putting $q = q_0 + q_1$ with 

$$q_1 = \begin{pmatrix} 0 & q_{BF} \\ q_{FB} & 0 \end{pmatrix}$$

and linearizing in $q_1$, one gets 

$$s q_1 s = -q_0^{-1} q_1 q_0^{-1} \qquad [17]$$

The solution space of this linear equation for $q_1$ has 
dimension 4 for all $q_0 \in M$. Based on it, one expects 
four Goldstone fermions to emerge along with the 
four Goldstone bosons of M. 
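
The linearization leading to [17] can be spelled out in one line. The sketch below assumes only that $q_0$ solves the dimensionless saddle-point equation $s q_0 s = q_0^{-1}$ and that $q_1$ is treated as a first-order perturbation; the symbol $O(q_1^2)$ for the discarded remainder is notation introduced here.

```latex
\begin{aligned}
s\,(q_0 + q_1)\,s &= (q_0 + q_1)^{-1}
  = q_0^{-1} - q_0^{-1}\, q_1\, q_0^{-1} + O(q_1^2), \\
s\, q_0\, s &= q_0^{-1}
  \quad\Longrightarrow\quad
  s\, q_1\, s = -\, q_0^{-1}\, q_1\, q_0^{-1} \quad \text{(to first order in } q_1\text{)} .
\end{aligned}
```

The first line is the Neumann expansion of the inverse; subtracting the zeroth-order equation leaves the linear condition on $q_1$.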

For the simple case under consideration, one can 
introduce local coordinates and push the analysis to 
nonlinear order, but things get quickly out of hand 
(when done in this way) for more challenging, 
higher-rank cases. Fortunately, there exists an 
alternative, coordinate-independent approach, as 
the mathematical object to be constructed is 
completely determined by symmetry! 


Riemannian Symmetric Superspace 


The linear equation [17] associates with every point 
$x \in M$ a four-dimensional vector space of solutions, 
$V_x$. As the point $x$ moves on M the vector spaces $V_x$ 
turn and twist; thus, they form what is called a 
vector bundle V over M. (The bundle at hand turns 
out to be nontrivial, i.e., there exists no global 
choice of coordinates for it.) 

A section of V is a smooth mapping $v : M \to V$ 
such that $v(x) \in V_x$ for all $x \in M$. The sections of 
V are to be multiplied in the exterior sense, as they 
represent anticommuting degrees of freedom; 
hence the proper object to consider is the exterior 
bundle, $\wedge V$. 

It is a beautiful fact that there exists a unique 
action of the Lie superalgebra $\mathfrak{g}$ on the sections of 
$\wedge V$ by first-order differential operators, or derivations 
for short. (Be advised however that this 
canonical $\mathfrak{g}$-action is not well known in physics or 
mathematics.) 

The manifold M is a symmetric space, that is, a 
Riemannian manifold with G-invariant geometry. 
Its metric tensor, $g$, uniquely extends to a second-rank 
tensor field (still denoted by $g$) which maps 
pairs of derivations of $\wedge V$ to sections of $\wedge V$, and is 
invariant with respect to the $\mathfrak{g}$-action. This collection 
of objects (the symmetric space M, the 
exterior bundle $\wedge V$ over it, the action of the Lie 
superalgebra $\mathfrak{g}$ on the sections of $\wedge V$, and the 
$\mathfrak{g}$-invariant second-rank tensor $g$) forms what 
the author calls a "Riemannian symmetric superspace," 
$\mathcal{M}$. 


Nonlinear Sigma Model 


According to the Landau-Ginzburg-Wilson (LGW) 
paradigm of the theory of phase transitions, the 
large-scale physics of a statistical mechanical system 
near criticality is expected to be controlled by an 
effective field theory for the long-wavelength excita- 
tions of the order parameter of the system. 

Wegner is credited with the profound insight that 
the LGW paradigm applies to the random matrix 
situation at hand, with the role of the order 
parameter being taken by the matrix Q. He argued 
that transport observables (such as the electrical 
conductivity) are governed by slow spatial variations 
of the Q-field inside the saddle-point manifold. 
Efetov skilfully implemented this insight in a 
supersymmetric variant of Wegner's method. 

While the direct construction of the effective 
continuum field theory by gradient expansion of 
[14] is not an entirely easy task, the outcome of the 
calculation is predetermined by symmetry. On 
general grounds, the effective field theory has to be 
a nonlinear sigma model for the Goldstone bosons 
and fermions of $\mathcal{M}$: if $(\phi^A)$ are local coordinates for 
the bundle V with metric $g_{AB}(\phi) = g(\partial/\partial\phi^A, \partial/\partial\phi^B)$, 
the action functional is 

$$S = \sigma \int d^d x\; \partial_\mu \phi^A\, g_{AB}(\phi)\, \partial_\mu \phi^B$$

The coupling parameter $\sigma$ has the physical meaning 
of bare (i.e., unrenormalized) conductivity. In the 
present model $\sigma = N W^2 a^{2-d}$, where W is essentially 
the width of the band random matrix in units of the 
lattice spacing $a$ (the short-distance cutoff of the 
continuum field theory). S is the effective action in 
the limit $z_1 = z_2$. For a finite frequency $\omega = z_1 - z_2$, a 
symmetry-breaking term of the form $i\omega\nu \int d^d x\, f(\phi)$, 
where $\nu = N (\pi\lambda)^{-1} a^{-d}$ is the local density of states, 
has to be added to S. 

By perturbative renormalization group analysis, that 
is, by integrating out the rapid field fluctuations, one 
finds for $d = 2$ that $\sigma$ decreases on increasing the cutoff 
$a$. This property is referred to as "asymptotic freedom" 
in field theory. On its basis one expects exponentially 
decaying correlations, and hence localization of all 
states, in two dimensions. However, a mathematical 
proof of this conjecture is not currently available. 

In three dimensions and for a sufficiently large bare 
conductivity, the renormalization flow goes toward 
the metallic fixed point ($\sigma \to \infty$), where G-symmetry 
is broken spontaneously. A rigorous proof of this 
important conjecture (existence of disordered metals 
in three space dimensions) is not available either. 


Zero-Mode Approximation 


For a system in a box of linear size L, the cost of 
exciting fluctuations in the sigma model field is 
estimated as the Thouless energy $E_{\mathrm{Th}} = \sigma / \nu L^2$. In the 
limit of small frequency, $|\omega| \ll E_{\mathrm{Th}}$, the physical 
behavior is dominated by the constant modes 
$\phi^A(x) = \phi^A$ (independent of $x$). By computing the 
integral over these modes, Efetov found the energy-level 
correlations in the small-frequency limit to be 
those of the GUE. 


See also: Random Matrix Theory in Physics; Symmetry 
Classes in Random Matrix Theory. 


Introduction 


Many systems of partial differential equations 
arising in mathematical physics and differential 
geometry are quasilinear: the top-order derivatives 
enter only linearly. They may be cast in the form 
of first-order systems by introducing, if needed, 
derivatives of the unknowns as additional unknowns. 
For such systems, the theory of symmetric-hyperbolic 
(SH) systems provides a unified framework for 
proving the local existence of smooth solutions if 
the initial data are smooth. It is also convenient for 
constructing numerical schemes, and for studying 
shock waves. Despite what the name suggests, the 
impact of the theory of SH systems is not limited to 
hyperbolic problems, two examples being Tricomi's 
equation, and equations of Cauchy-Kowalewska 
type. 

Application of the SH framework usually requires a 
preliminary reduction to SH form (“symmetrization”). 

After comparing briefly the theory of SH systems 
with other functional-analytic approaches, we col- 
lect basic definitions and notation. We then present 
two general rules, for symmetrizing conservation 
laws and strictly hyperbolic equations, respectively. 
We next turn to special features possessed by linear 
SH systems, and give a general procedure to prove 
existence, which covers both linear and nonlinear 
systems. We then summarize those results on shock 
waves, and on blow-up singularities, which are 
related to SH structure. Examples and applications 
are collected in the last section. 

The advantages of SH theory are: a standardized 
procedure for constructing solutions; the availability 
of standard numerical schemes; a natural way to 
prove that the speed of propagation of support is 
finite. On the other hand, the symmetrization 
process is sometimes ad hoc, and does not respect 
the physical or geometric nature of the unknowns; 
to obviate this defect to some extent, we remark that 
symmetrizers may be viewed as introducing a new 
Riemannian metric on the space of unknowns. The 
search for a comprehensive criterion for identifying 
equations and boundary conditions compatible with 
SH structure is still the object of current research. 
The most important fields of application of the 
theory today are general relativity and fluid 
dynamics, including magnetohydrodynamics. 


Context of SH Theory in Modern Terms 


The basic reason why the theory works may be 
summarized as follows for the modern reader; the 
history of the subject is, however, more involved. 

Let H be a real Hilbert space. Consider a linear 
initial-value problem $du/dt + Au = 0$; $u(0) = u_0 \in H$, 
where A is unbounded, with domain D(A). By 
Stone's theorem, one can solve it in a generalized 
sense, if the unbounded operator A satisfies 
$A + A^* = 0$. This condition contains two ingredients: a 
symmetry condition on A, and a maximality condition 
on D(A), which incorporates boundary conditions 
(von Neumann, Friedrichs). Semigroup theory 
(Hille and Yosida, Phillips, and many others) 
handles more general operators A: it is possible to 
solve this problem in the form $u(t) = S(t)u_0$ for $t > 0$, 
where $(S(t))_{t \ge 0}$ is a continuous contraction semigroup, 
if and only if $(Au, u) \ge 0$, and equation 
$x + Ax = y$ has a solution for every $y$ in H (this is a 
maximality condition on D(A)). One then says that 
A is maximal monotone. For such operators, 
$A + A^* \ge 0$. SH systems are systems $Q u_t + Au = F$, 
satisfying two algebraic conditions ensuring formally 
that $A + A^*$ is bounded, and that Q is 
symmetric and positive definite. This algebraic 
structure enables one to solve the problem directly, 
without explicit reference to semigroup theory. 
Precise definitions are given next. 

We assume throughout that all coefficients, 
nonlinearities, and data are smooth unless otherwise 
specified. 


Definitions 


Consider a quasilinear system 

$$M^{A\alpha}_{B}(x, u)\, \partial_\alpha u^B = N^A(x, u) \qquad [1]$$

where $u = (u^A)_{A=1,\dots,m}$, $x = (x^\alpha)_{\alpha=0,\dots,n}$, and 
$\partial_\alpha = \partial/\partial x^\alpha$. The components of $u$ may be real or 
complex. We follow the summation convention on 
repeated indices in different positions; $x^0 = t$ may be 
thought of as the evolution variable; we write 
$x = (t, \mathbf{x})$, with $\mathbf{x} = (x^1, \dots, x^n)$. Indices A, B, ... run 
from 1 to m, indices j, k, ... from 1 to n, and Greek 
indices from 0 to n. The complex conjugate of $u^A$ is 
written $\bar{u}^A$. 

• Equation [1] is symmetrizable if there are functions 
$\sigma_{AB}(x, u)$ such that 
$M^\alpha_{AB} := \sigma_{AC} M^{C\alpha}_{B}$ 
satisfies the condition $M^\alpha_{AB} = \overline{M^\alpha_{BA}}$ for every $\alpha$. 

• It is symmetric if it is symmetrizable with 
$\sigma_{AB} = \delta_{AB}$. 

• It is symmetric-hyperbolic with respect to $k_\alpha$ if it 
is symmetric and if $k_\alpha M^\alpha_{AB}$ is positive definite: 
$k_\alpha M^\alpha_{AB}\, \xi^A \bar{\xi}^B > 0$ for $\xi = (\xi^A) \ne 0$. 

Thus, a symmetrizer $(\sigma_{AB})$ gives rise to a 
Riemannian metric $(k_\alpha \sigma_{AC} M^{C\alpha}_{B})$ on the space of 
unknowns, independent of any Riemannian structure 
on x-space. The system is SH with respect to $x^0$ 
if $k_\alpha = \delta_\alpha^0$. 

The simplest class of SH systems is provided by 
real semilinear systems of the form 

$$A^0(x)\, \partial_t u + A^j(x)\, \partial_j u = N(x, u) \qquad [2]$$

where the $A^\alpha$ are real symmetric matrices, $A^0$ is 
symmetric and positive definite, and $k_\alpha = \delta_\alpha^0$. Writing 
$A^0 = P^2$, with P symmetric and positive definite, 
one finds that $v = Pu$ solves a SH system with $A^0 = I$ 
(identity matrix). 
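
The reduction to an identity time-coefficient can be written out as a short computation. The sketch below assumes that $P$ is smooth and invertible, writes $A^0$ and $A^j$ for the coefficient matrices of the semilinear system [2], and introduces the names $\tilde{A}^j$ and $\tilde{N}$ purely for illustration.

```latex
\begin{aligned}
u = P^{-1} v:\quad
& P^2\, \partial_t (P^{-1} v) + A^j\, \partial_j (P^{-1} v) = N, \\
\text{multiply by } P^{-1}:\quad
& \partial_t v + \tilde{A}^j\, \partial_j v = \tilde{N}, \\
& \tilde{A}^j := P^{-1} A^j P^{-1}, \qquad
  \tilde{N} := P^{-1} N - \bigl[ P\, (\partial_t P^{-1}) + P^{-1} A^j (\partial_j P^{-1}) \bigr] v .
\end{aligned}
```

Since $P$ and $A^j$ are symmetric, each $\tilde{A}^j$ is again symmetric, and the extra terms generated by differentiating $P^{-1}$ contain no derivatives of $v$, so the system stays semilinear.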

Conservation laws (with "reaction" or "source" 
term $N^A$) are usually defined as quasilinear systems 
of the form 

$$\partial_\alpha f^{A\alpha}(x, u) = N^A(x, u) \qquad [3]$$

They are common in fluid dynamics and combustion. 
They are limiting cases of nonlinear diffusion 
equations of the typical form 

$$\partial_\alpha f^{A\alpha}(x, u) = N^A(x, u) + \partial_j \bigl( B^{jk} \partial_k u \bigr)^A \qquad [4]$$

The determination of the form of the coefficients 
$B^{jk}$ is a nontrivial modeling issue; they may reflect 
varied physical processes such as heat conduction, 
viscosity, or bulk viscosity. They may depend on $x$, 
$u$, and the derivatives of $u$. The simplest case is 
$(B^{jk})^A_{\ B} = D^{jk} \delta^A_B$ with $(D^{jk})$ diagonal. Some authors 
require the symmetry condition 


bacBO* = poB" [5] 


Equations in which $f^{A\alpha} = u^A \delta^{\alpha 0}$ are called 
reaction-diffusion equations; they arise in physical and 
biological problems in which chemical reactions 
and diffusion phenomena are combined, and in 
population dynamics. 

A conservation law is symmetric if and only if 
$\partial f^{A\alpha} / \partial u^B$ is symmetric in A and B, which means 
that there are, locally, functions $g^\alpha(x, u)$ such that 
$f^{A\alpha} = \partial g^\alpha / \partial u^A$. 

A more fundamental derivation of conservation 
laws would take us beyond the scope of this survey. 


Symmetrization 


Two general procedures for symmetrization are 
available: one for conservation laws, the other for 
semilinear strictly hyperbolic problems. 


Conservation Laws with a Convex Entropy 


Consider, for simplicity, a conservation law of 
the form 

$$\partial_t u^A + \partial_j f^{Aj}(u) = 0 \qquad [6]$$

We, therefore, assume that $f^{Aj} = f^{Aj}(u)$ and 
$f^{A0}(u) = u^A$. We show that the following three 
statements are equivalent locally: (1) there is a 
strictly convex function $U(u)$ such that 
$\sigma_{AB} = \partial^2 U / \partial u^A \partial u^B$ is a symmetrizer; (2) eqn [6] implies a 
scalar relation of the form $\partial_\alpha U^\alpha = 0$, with $U^0$ strictly 
convex; and (3) there is a change of unknowns 
$v_A = v_A(u)$ such that the system satisfied by $v = (v_A)$ 
is SH and $(\partial v_A / \partial u^B)$ is positive definite. 

In fluid dynamics, $U^0$ may sometimes be related 
to specific entropy, and $U^j$ to entropy flux. For this 
reason, if (2) holds, one says that $U^0$ is an entropy 
for eqn [6], and that $(U^0, U^j)$ is an entropy pair. A 
system may have several entropies in this sense; this 
fact is sometimes useful in studying convergence 
properties of approximate solutions of eqn [6]. 

Let us now prove the equivalence of these 
properties. 

Assume first (3): there are new unknowns 
$v_A = v_A(u)$ and functions $g^\alpha(v)$ such that 
$f^{A\alpha} = \partial g^\alpha / \partial v_A$. One finds that if eqn [6] holds, 

$$\partial_\alpha U^\alpha = 0 \quad \text{where} \quad U^\alpha = v_A \frac{\partial g^\alpha}{\partial v_A} - g^\alpha \qquad [7]$$

Furthermore, we have $f^{A0} = u^A$; therefore, eqn [7] 
gives: $U^0 = v_A u^A - g^0$, so that $U^0$ is the Legendre 
transform (familiar from mechanics) of $g^0$. It follows 
that $v_A = \partial U^0 / \partial u^A$. Finally, 
$(\partial v_A / \partial u^B) = (\partial^2 U^0 / \partial u^A \partial u^B)$ is positive definite, and $U^0$ is strictly 
convex. 

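We have proved that (3) implies (2). Next, assume 
(2): the entropy equality $\partial_t U + \partial_j U^j = 0$ holds identically, 
and not just for the solution at hand. Using 
[6], we find 

$$\left( \frac{\partial U^j}{\partial u^B} - \frac{\partial U}{\partial u^A} \frac{\partial f^{Aj}}{\partial u^B} \right) \partial_j u^B = 0$$

Assumption (2), therefore, means that U is strictly 
convex and satisfies 

$$\frac{\partial U}{\partial u^A} \frac{\partial f^{Aj}}{\partial u^B} = \frac{\partial U^j}{\partial u^B} \qquad [8]$$

Now, letting $v_A = \partial U / \partial u^A$ and $g^j(v) = v_A f^{Aj} - U^j$, 
we find 

$$\frac{\partial g^j}{\partial v_A} = f^{Aj} + \left( v_B \frac{\partial f^{Bj}}{\partial u^C} - \frac{\partial U^j}{\partial u^C} \right) \frac{\partial u^C}{\partial v_A} = f^{Aj} \qquad [9]$$

Let $\sigma_{AB} = \partial^2 U / \partial u^A \partial u^B$. Since U is strictly convex, 
$(\sigma_{AB})$ is positive definite, and so is its inverse. We 
have now proved (3). Note that $u^A = \partial g^0 / \partial v_A$, 
where $g^0(v) = u^A v_A - U(u)$ is the Legendre transform 
of U. 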

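Next, using eqn [9], and the relations 
$\sigma_{AB} = \partial v_A / \partial u^B = \partial v_B / \partial u^A$, we find 

$$0 = \sigma_{AB} \bigl[ \partial_t u^B + \partial_j f^{Bj} \bigr] = \sigma_{AB}\, \partial_t u^B + \frac{\partial v_B}{\partial u^A} \frac{\partial^2 g^j}{\partial v_B \partial v_C}\, \partial_j v_C = \sigma_{AB}\, \partial_t u^B + \sigma_{AB} \frac{\partial^2 g^j}{\partial v_B \partial v_C} \sigma_{CD}\, \partial_j u^D$$

which is SH; therefore, $\sigma_{AB}$ is a symmetrizer for eqn 
[6], and (1) is proved. Thus, (2) implies (1) and (3). 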

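Finally, if (1) holds, $\sigma_{AC}\, \partial f^{Cj} / \partial u^B$ is symmetric in 
A and B. It follows that 

$$\frac{\partial}{\partial u^C} \left[ \frac{\partial U}{\partial u^A} \frac{\partial f^{Aj}}{\partial u^B} \right] = \sigma_{AC} \frac{\partial f^{Aj}}{\partial u^B} + \frac{\partial U}{\partial u^A} \frac{\partial^2 f^{Aj}}{\partial u^B \partial u^C}$$

is symmetric in B and C, so that there are, locally, 
functions $U^j$ such that eqn [8] holds. Therefore, 
$(U, U^j)$ is an entropy pair, and we see that (1) 
implies (2). 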

This completes the proof of the equivalence of (1), 
(2), and (3). 
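
As a concrete check of statement (1), the following minimal sketch uses the one-dimensional shallow-water equations, which are not discussed in the text and are chosen here only as a familiar illustration. The unknowns are u = (h, m) with flux f(u) = (m, m^2/h + G h^2/2), and the candidate entropy is the energy U = m^2/(2h) + G h^2/2. The function names, the constant G and the sample state are all assumptions made for this example. The code verifies that the Hessian of U is positive definite and that its product with the flux Jacobian is symmetric.

```python
# Shallow-water equations in conserved variables u = (h, m), m = h * velocity.
# G and the sample state below are illustrative assumptions, not from the text.
G = 9.81


def flux_jacobian(h, m):
    """Jacobian df/du of the flux f(u) = (m, m^2/h + G*h^2/2)."""
    return [[0.0, 1.0],
            [-(m / h) ** 2 + G * h, 2.0 * m / h]]


def entropy_hessian(h, m):
    """Hessian of the candidate entropy U(u) = m^2/(2h) + G*h^2/2."""
    return [[m * m / h ** 3 + G, -m / h ** 2],
            [-m / h ** 2, 1.0 / h]]


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def is_positive_definite_2x2(s):
    """Sylvester criterion for a symmetric 2x2 matrix."""
    return s[0][0] > 0 and s[0][0] * s[1][1] - s[0][1] * s[1][0] > 0


h, m = 2.0, 3.0                      # sample state with h > 0
sigma = entropy_hessian(h, m)        # plays the role of the symmetrizer
sym_flux = matmul2(sigma, flux_jacobian(h, m))
asymmetry = abs(sym_flux[0][1] - sym_flux[1][0])

print("Hessian positive definite:", is_positive_definite_2x2(sigma))
print("asymmetry of sigma * df/du:", asymmetry)
```

The determinant of the Hessian works out to G/h, so strict convexity holds exactly on the set h > 0, which is where the symmetrized system is SH.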


Strictly Hyperbolic Equations 


Consider the scalar equation $Pf = g(t, x)$, where P is 
the linear operator 

$$P = \partial_t^N - \sum_{j=0}^{N-1} p_{N-j}(t, x)\, \partial_t^j$$

of order N. Let $\Lambda = (1 - \Delta)^{1/2}$, where $\Delta$ is the Laplace 
operator on the space variables. Then $u = (u^A)$, where 
$u^A = \partial_t^{A-1} \Lambda^{N-A} f$ for $A = 1, \dots, N$, solves a first-order 
pseudodifferential system of the form 

$$u_t - Lu = G$$

If P is strictly hyperbolic, the principal symbol 
$a_1(t, x, \xi)$ of L has a diagonal form with real 
eigenvalues $\lambda_j(t, x, \xi)$, and there are projectors 
$p_j(t, x, \xi)$ ($p_j^2 = p_j$) which commute with $a_1$, such that 
$1 = \sum_j p_j$ and $a_1 = \sum_j \lambda_j p_j$. Let $r_0 = \sum_j p_j^* p_j$, and 
$r_0(D)$ the corresponding operator. Equation 

$$r_0(D)\, \partial_t u - r_0(D) L u = r_0(D) G$$

is formally SH in the following sense: $r_0$ is positive 
definite and $r_0 a_1$ is Hermitian. 


Linear Problems 


Consider a linear system 

$$Lu := Q(t, x)\, \partial_t u + A^j(t, x)\, \partial_j u + B(t, x) u = f(t, x) \qquad [10]$$

We assume that Q and the $A^j$ are real and 
symmetric, $Q \ge c$ with $c$ positive, and all coefficients 
and their first-order derivatives are bounded. 


Energy Identity 


Multiplying the equation by $u^T$ (transpose of $u$), one 
derives the "energy identity" 

$$\partial_t (u^T Q u) + \partial_j (u^T A^j u) + u^T C u = 2 u^T f(t, x) \qquad [11]$$

where $C = 2B - \partial_t Q - \partial_j A^j$. C is not necessarily 
positive. However, $v := u \exp(-\lambda t)$ satisfies a linear 
SH system for which C is positive definite if $\lambda$ is 
large enough. 
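
For readers who want the intermediate step, the identity can be derived as follows. This is a sketch that writes $Q$ for the coefficient of the time derivative in the linear system, $\lambda$ for the exponential rate, and $c$ for the lower bound of $Q$; the names $C_\lambda$ and the rescaled right-hand side are introduced here for illustration only.

```latex
\begin{aligned}
2 u^T Q\, \partial_t u &= \partial_t (u^T Q u) - u^T (\partial_t Q)\, u,
\qquad
2 u^T A^j \partial_j u = \partial_j (u^T A^j u) - u^T (\partial_j A^j)\, u, \\
2 u^T f &= 2 u^T \bigl( Q\, \partial_t u + A^j \partial_j u + B u \bigr)
        = \partial_t (u^T Q u) + \partial_j (u^T A^j u)
          + u^T \bigl( 2B - \partial_t Q - \partial_j A^j \bigr) u . \\[4pt]
v = u\, e^{-\lambda t}: \quad
& Q\, \partial_t v + A^j \partial_j v + (B + \lambda Q)\, v = e^{-\lambda t} f,
\qquad
C_\lambda = C + 2 \lambda Q \;\ge\; C + 2 \lambda c .
\end{aligned}
```

The first line uses only the symmetry of $Q$ and $A^j$. Because the coefficients and their first derivatives are bounded, $C$ is bounded below, so $C_\lambda$ becomes positive definite once $2\lambda c$ exceeds that bound.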


Propagation of Support 


A basic property of wave-like equations is finite 
speed of propagation of support: if the right-hand 
side vanishes, and if the solution at time 0 is 
localized in the ball of radius $r$, then the solution 
at time $t$ is localized in the ball of radius $r + ct$ for a 
suitable constant $c$. 

This property also holds for SH systems. To see 
this, let us consider the set where a solution $u$ 
vanishes: if the initial condition vanishes for $|x| \le R$, 
we claim that $u$ at some later time vanishes for 
$|x| \le R - t/a$, for $a$ large enough. 

Indeed, let us integrate the energy identity on a 
truncated cone $\Gamma := \{ |x| \le a(t_0 - t)/t_0;\ 0 \le t \le t_1 \}$ 
with $t_1 < t_0$. The boundary of $\Gamma$ consists of three 
parts: $\partial\Gamma = \Omega_0 \cup \Omega_1 \cup S$, where $\Omega_0$ and $\Omega_1$ represent 
the portions of the boundary on which $t = 0$ and $t = t_1$, 
respectively. The outer normal to S is proportional 
to $(a, t_0 x^j / |x|)$. Let $E(s)$ denote the integral of $u^T Q u$ 
on $\Gamma \cap \{ t = s \}$. Integrating eqn [11] by parts, we 
obtain 

$$E(t_1) - E(0) + \int_S u^T \Phi u\, ds = \iint_\Gamma (2 u^T f - u^T C u)\, dt\, dx \qquad [12]$$

where $\Phi$ is proportional to $aQ + t_0 \sum_j x^j A^j / |x|$. 
Take $a$ so large that $\Phi$ is positive definite. The 
integral over S is then non-negative. If C is positive 
definite, $f = 0$, and the initial data vanish on $\Omega_0$ so that $E(0) = 0$, we find that 
$E(t_1) \le 0$. Since Q is positive definite, this implies 
$u = 0$ on $\Omega_1$, as claimed. 


A Numerical Scheme 


System $Lu = f$ may be discretized, for example, by 
the Lax-Friedrichs method: let $h$ be the discretization 
step in space, and $k$ the time step; write 
$(\tau_j u)(t, x) = u(t, x^1, \dots, x^j + h, \dots)$ (translation in 
the $j$ direction). One replaces $\partial_j u$ by the centered 
difference in the $j$ direction: $(\tau_j u - \tau_j^{-1} u)/2h$; and the 
time derivative by 

$$\Bigl[ u(t + k, x) - \frac{1}{2n} \sum_j \bigl( \tau_j u(t, x) + \tau_j^{-1} u(t, x) \bigr) \Bigr] / k \qquad [13]$$

For consistency of the scheme, we require $k/h = \lambda > 0$ 
to be fixed as $k$ and $h$ tend to zero; stability then 
holds if $\lambda$ is small. 


Nonlinear Problems and Singularities 


We give a simple setup for proving the existence of 
smooth solutions to SH systems for small times. 
Such solutions may develop singularities. We limit 
ourselves to two types of singularities, on which SH 
structure provides some information: jump disconti- 
nuities and blow-up patterns. Caustic formation is 
not considered. 


Construction of a Smooth Solution 


Consider a real SH system (eqn [1]). Recall that a 
function of $x$ belongs to the Sobolev space $H^s$ if its 
derivatives of order $s$ or less are square-integrable. 
One constructs a solution defined for $t$ small, which 
is in $H^s$, $s > n/2 + 1$, as a function of $x$, by the 
following procedure: 

(1) Replace spatial derivatives by regularized operators, 
which should be bounded in Sobolev 
spaces; the regularized equation is an ODE in 
$H^s$; let $u_\varepsilon$ be its solution. 

(2) Write the equation satisfied by derivatives of 
order $s$ of $u_\varepsilon$, and apply the energy identity to it. 

(3) Find a positive T such that the solution is 
bounded in $H^s$ for $|t| \le T$, uniformly in $\varepsilon$; this 
implies a $C^1$ bound. 

(4) Prove the convergence of the approximations 
in $L^2$. 

(5) Prove the continuity in time of the $H^s$ norm; 
conclude that the $u_\varepsilon$ tend to a solution in 
$C(-T, T; H^s)$. 


The result admits a local version, in which 
Sobolev spaces are replaced by Kato's *uniformly 
local" spaces. Uniqueness of the solution is proved 
along similar lines. We do not attempt to identify 
the infimum of the values of s for which the Cauchy 
problem is well-posed. 


Jump Discontinuities: Shock Waves 


A "shock wave" is a weak solution of a system of 
conservation laws admitting a jump discontinuity. 
By definition, weak solutions satisfy, for any smooth 
function $\phi_A(x)$ with compact support, 

$$\iint \bigl\{ f^{A\alpha}\, \partial_\alpha \phi_A + N^A \phi_A \bigr\}\, dt\, dx = 0$$


The theory of shock waves is an attempt to 
understand solutions of conservation laws which are 
limits of solutions of diffusion equations; the hope is 
that the influence of second-derivative terms is 
appreciable only near shocks, and that, for given 
initial data, there is a unique weak solution of the 
conservation law which may be obtained as such a 
limit, if modeling has been done correctly. This 
problem may be difficult already for a single shock 
(“shock structure"). 

The theory of shock waves follows the one- 
dimensional theory closely. We therefore describe 
the main facts for a conservation law in one space 
dimension ($u = u(t, x)$): 

$$\partial_t u + \partial_x f(u) = 0$$

If a shock travels at speed $c$, the weak formulation 
of the equations gives the Rankine-Hugoniot relation 
$c[u] = [f(u)]$, where square brackets denote 
jumps. There may be several weak solutions having 
the same initial condition. One restricts solutions by 
making two further requirements: (1) the system 
admits an entropy pair (U, F) with a convex entropy 
and (2) to be admissible, weak solutions must be 
limits of "viscous approximations" 

$$\partial_t u + \partial_x f(u) = \varepsilon\, \partial_x^2 u$$

as $\varepsilon \to 0$. One then finds easily that the entropy 
equality ($\partial_t U + \partial_x F = 0$) must be replaced, for such 
weak solutions, by the entropy condition: 
$\partial_t U + \partial_x F \le 0$ in the weak sense. This condition admits a 
concrete interpretation if the gradient of each 
characteristic speed is never orthogonal to the 
corresponding right eigenvector ("genuine nonlinearity"); 
in that case, characteristics must impinge 
on the shock ("shock inequalities"). 
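
These conditions are easy to evaluate for a scalar example. The minimal sketch below assumes Burgers' equation, f(u) = u^2/2, with the entropy pair U = u^2/2, F = u^3/3; this example and all function names are chosen here for illustration and do not come from the text. For a single jump from a left state to a right state moving at speed c, the weak form of the entropy condition reduces to [F] - c[U] <= 0, with jumps taken as right minus left.

```python
def flux(u):
    """Burgers flux f(u) = u^2 / 2."""
    return 0.5 * u * u


def entropy(u):
    """Convex entropy U(u) = u^2 / 2."""
    return 0.5 * u * u


def entropy_flux(u):
    """Entropy flux F(u) = u^3 / 3, chosen so that F' = U' * f'."""
    return u ** 3 / 3.0


def shock_speed(u_left, u_right):
    """Rankine-Hugoniot speed c = [f(u)] / [u]."""
    return (flux(u_right) - flux(u_left)) / (u_right - u_left)


def entropy_production(u_left, u_right):
    """Jump [F] - c[U]; the entropy condition requires this to be <= 0."""
    c = shock_speed(u_left, u_right)
    return (entropy_flux(u_right) - entropy_flux(u_left)
            - c * (entropy(u_right) - entropy(u_left)))


def characteristics_impinge(u_left, u_right):
    """Shock inequalities for Burgers: f'(u_left) > c > f'(u_right), f'(u) = u."""
    c = shock_speed(u_left, u_right)
    return u_left > c > u_right


for states in [(2.0, 0.0), (0.0, 2.0)]:
    print(states,
          "speed:", shock_speed(*states),
          "entropy production:", entropy_production(*states),
          "admissible:", characteristics_impinge(*states))
```

Both jumps satisfy the Rankine-Hugoniot relation with the same speed, but only the compressive one (left state larger than right state) has non-positive entropy production, in agreement with the shock inequalities.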

For the equations of gas dynamics with polytropic 
law ($p v^\gamma = \text{const.}$), there is a unique solution with 
initial condition $u = u_l$ for $x < 0$, $u = u_r$ for $x > 0$, 
where $u_l$ and $u_r$ are constant ("Riemann problem") 
which satisfies the entropy condition, provided $|u_l - u_r|$ 
is small. More generally, if the equation of state 
$p = p(v, s) > 0$ satisfies $\partial p / \partial v < 0$ and $\partial^2 p / \partial v^2 > 0$, 
the shock inequalities are equivalent to the fact that 
the entropy increases after the passage of a shock 
with $|u_l - u_r|$ small. 

On the numerical side, one should mention: 
(1) the widely used idea of upstream differencing; 
(2) the Lax-Wendroff scheme, the complete analysis 
of which requires tools from soliton theory; and 
(3) the availability of general results for dissipative 
schemes for SH systems. 

Recent trends include: (1) admissibility conditions 
when genuine nonlinearity does not hold and 
(2) other approximations of shock wave problems, 
most notably kinetic formulations. 

Some of the ideas of shock wave theory have been 
applied to Hamilton-Jacobi equations and to 
motion by mean curvature, with applications to 
front propagation problems and “computer vision.” 


Stronger Singularities: Blow-Up Patterns 


The amplitude of a solution may also grow without 
bound. Examples include optical pulse propagation 
in Kerr media and singularities in general relativity. 
The phenomenon is common when reaction terms 
are allowed. As we now explain, this phenomenon is 
reducible to SH theory in many cases of interest. 
Blow-up singularities are usually not governed by 
the characteristic speeds defined by the principal 
part, because top-order derivatives are balanced by 
lower-order terms. In many applications, a systema- 
tic process (Fuchsian reduction) enables one to 
identify the correct model near blow-up; as a result, 


one can write the solution as the sum of a singular 
part, known in closed form, and a regular part. If 
the singularity locus is represented by $t = 0$, the 
regular part solves a renormalized equation of the 
typical form 

$$t M u + A u = N \qquad [14]$$

where $Mu = 0$ is SH. Under natural conditions, for 
any initial condition $u_0$ such that $A u_0 = 0$, there is a 
unique solution of eqn [14] defined for small $t$. 

The upshot is an asymptotic representation of 
solutions which renders the same services as an 
exact solution, and is valid precisely where numeri- 
cal computation breaks down. 

Fuchsian reduction enables one in particular to 
study (1) the blow-up time; (2) how the singularity 
locus varies when Cauchy data, prescribed in the 
smooth region, are varied; and (3) expressions which 
remain finite at blow-up. It is the only known general 
procedure for constructing analytically singular 
spacetimes involving arbitrary functions, rather than 
arbitrary parameters, and is therefore relevant to the 
search for alternatives to the big bang. 


Examples and Applications 
Wave Equation with Variable Coefficients 


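Consider the equation 

$$\partial_{tt} u + 2 a^j(x)\, \partial_{jt} u - a^{jk}(x)\, \partial_{jk} u = f(t, x, u, \nabla u)$$

with $(a^{jk})$ positive definite. Letting 
$v = (v_0, \dots, v_{n+1}) := (u, \partial_j u, \partial_t u)$, we find the system 

$$\begin{aligned}
\partial_t v_0 &= v_{n+1} \\
\partial_t v_j - \partial_j v_{n+1} &= 0 \\
\partial_t v_{n+1} + 2 a^k \partial_k v_{n+1} - a^{jk} \partial_k v_j &= f
\end{aligned}$$

It is symmetrizable, using the quadratic form 
$\sigma_{AB} u^A u^B = v_0^2 + a^{jk} v_j v_k + v_{n+1}^2$. 
One proves directly that, if $v_j = \partial_j v_0$ for $t = 0$, this 
relation remains true for all $t$. 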


Maxwell’s Equations 


Maxwell's equations may be split into six evolution 
equations: $\partial_t E - \operatorname{curl} B + j = 0$ and $\partial_t B + \operatorname{curl} E = 0$, 
and two "constraints" $\operatorname{div} E - \rho = 0$, $\operatorname{div} B = 0$. The 
system of evolution equations is already in symmetric 
form; the quadratic form $\sigma_{AB} u^A u^B$ is here 
$|E|^2 + |B|^2$. 


Compressible Fluids 


Consider first the case of a polytropic gas: 

$$\begin{aligned}
\partial_t v + (v \cdot \nabla) v + \rho^{-1} \nabla p &= 0 \\
\partial_t \rho + \operatorname{div}(\rho v) &= 0
\end{aligned} \qquad [15]$$

with $p$ proportional to $\rho^\gamma$. Taking $(p, v)$ as 
unknowns, one readily finds the SH system 

$$\frac{1}{\gamma p} \partial_t p + \frac{1}{\gamma p} (v \cdot \nabla) p + \operatorname{div} v = 0 \qquad [16]$$

$$\rho\, \partial_t v + \nabla p + \rho (v \cdot \nabla) v = 0 \qquad [17]$$

Symmetrization for more general compressible 
fluids with dissipation, including bulk viscosity, so 
as to satisfy the additional condition [5] may be 
achieved if we take as thermodynamic variables $\rho$ 
and T, and assume pressure $p$ and internal energy $\varepsilon$ 
satisfy $\partial p / \partial \rho > 0$ and $\partial \varepsilon / \partial T > 0$, by taking as 
unknowns $(\rho, \rho v, \rho(\varepsilon + |v|^2/2))$. The specific entropy 
$s$ satisfies $d\varepsilon = T\, ds - p\, d(1/\rho)$. If the viscosity and 
heat conduction coefficients are positive, one finds 
that $U = -\rho s$ is a convex entropy (in the sense of SH 
theory) on the set where $\rho > 0$, $T > 0$. 


Einstein's Equations 


The computation of solutions of Einstein's equations 
over long times, in particular in the study of 
coalescence of binary stars, has recently led to 
unexplained difficulties in the standard Arnowitt- 
Deser-Misner (ADM) formulation of the initial- 
value problem in general relativity. One way to 
tackle these difficulties is to rewrite the field 
equations in SH form; we focus on this particular 
aspect of recent research. 

Recall the problem: find a four-dimensional 
metric $g_{ab}$ with Lorentzian signature, such that 
$R_{ab} - \tfrac{1}{2} R g_{ab} = \chi T_{ab}$, with $\nabla^a T_{ab} = 0$, combined 
with an equation of state if necessary. $R_{ab}$ is the 
Ricci tensor and $R = g^{ab} R_{ab}$ is the scalar curvature; 
they depend on derivatives of the metric up to order 2. 
In addition to the metric, $T_{ab}$ involves physical 
quantities such as fluid 4-velocity or an electromagnetic 
field. The conservation laws of classical 
mathematical physics are all contained in the 
relation $\nabla^a T_{ab} = 0$. 

Now, the field equations cannot be solved for 
$\partial_t^2 g_{ab}$, and, as a consequence, the Taylor series of $g_{ab}$ 
with respect to time cannot be determined, even 
formally, from the values of $g_{ab}$ and $\partial_t g_{ab}$ for $t = 0$ 
(i.e., the Cauchy data). Furthermore, these data 
must satisfy four constraint equations. If the 
constraints are satisfied initially, they "propagate." 
But in numerical computation, these constraints are 


never exactly satisfied, and the computed solution 
may deviate considerably from the exact solution. 
Also, numerical computations depend heavily on the 
way Einstein's equations are formulated. 

The simplest way to derive a SH system is to 
replace $R_{ab}$ by $R^{(h)}_{ab} = R_{ab} - \tfrac{1}{2} [ g_{bc}\, \partial_a \Gamma^c + g_{ac}\, \partial_b \Gamma^c ]$, 
where $\Gamma^c := g^{ab} \Gamma^c_{ab}$. It turns out that 
$R^{(h)}_{ab} = -\tfrac{1}{2} g^{cd} \partial_{cd}\, g_{ab} + H_{ab}(g, \partial g)$, where the expression of 
$H_{ab}$ is immaterial. Applying to each component of 
the metric the treatment of the first example above 
(wave equation with variable coefficients), one 
easily derives an SH system of 50 equations for 50 
unknowns: the ten independent components of the 
metric, and their 40 first-order derivatives. Now, if 
the $\Gamma^c$ are initially zero (coordinates are "harmonic"), 
they remain so at later times. 

Unfortunately, the harmonic coordinate condition 
does not seem to be stable in the large. More recent 
formulations start with one of the standard setups 
(ADM formalism, conformal equations, tetrad 
formalism, Newman-Penrose formalism) and pro- 
ceed by adding combinations of the constraints to 
the equations, multiplied by parameters adjusted so 
as to ensure hyperbolicity or symmetric-hyperbolicity 
if needed. Another recent idea is to add a new 
unknown $\lambda$ which monitors the failure of the 
constraint equations; one adds to the equations a 
new relation of the form $\partial_t \lambda = \alpha C - \beta \lambda$, where 
$C = 0$ is equivalent to the constraints, and $\alpha$ and $\beta$ 
are parameters. One then adds coupling terms to 
make the extended system SH. It is expected that the 
set of constraints acts as an attractor. 

Reported computations indicate that these meth- 
ods have resulted in an improvement of the time 
over which numerical computations are valid. 


Tricomi's Equation 


Let (x,y) solve (y0l — 0; )p = 0. Letting u= 
e (Oxy, 0yq), one finds a symmetric system Lu — 0, 


with 
fy 9 (0 1 
Los (1 1) x $ 5) 


«(11 


we find that K = ZL = A! Ó, + A*0, + B, where 


1 
_1¢9 Alaa a2y— (atray ^y 


is positive definite if y is bounded, of arbitrary sign, 
and A is small. 


Cauchy-Kowalewska Systems 


Consider a complex system 


Qu = A'(z,t,u) = + B(z,t, u) [18] 
where u=(u4),z=(z',...,2”). The coefficients are 
analytic in their arguments when z and t are close to 
the origin and u is bounded by some constant K. 
The Cauchy-Kowalewska theorem ensures that, for 
any analytic initial condition near the origin, this 
system has a unique analytic solution near z- 0, 
even without any symmetry assumption on the A’. 
This result is a consequence of SH theory 
(Garabedian). 

Indeed, write z/ =x! + iy”, 0, =(1/2)(0,; — 10,,), and 
Oz, = (1/2)(9y + i0,;). Recall that analytic functions 
of z satisfy the Cauchy-Riemann equations 0,4 =Q. 

Adding (A/)! ô, to [18], and using the definition of 
ð; and z, we find the symmetric system 


u, = 5A + (A) 8n 


+A - (A) )üyu + B [19] 
Solving this system, we find a candidate u for a 
solution of eqn [18]. To show that z is analytic if the 
data are, we solve a second SH system for 
w =w :— 0; u. If the data are analytic, w vanishes 
initially, and therefore remains zero for all f. 
Therefore, u is indeed analytic. 
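
The algebraic step behind the symmetric system [19] can be checked numerically. The following is a minimal sketch, not part of the original argument: it takes one arbitrary complex matrix $A$ (a random 3 x 3 example, chosen here purely for illustration) and confirms that the two coefficient matrices $\tfrac12(A + A^*)$ and $\tfrac{1}{2i}(A - A^*)$ are Hermitian and recombine to $A$. The helper names `dagger`, `combo` and `is_hermitian` are hypothetical.

```python
import random

def dagger(m):
    """Conjugate transpose of a square matrix stored as a list of rows."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def combo(a, b, ca, cb):
    """Entrywise linear combination ca*a + cb*b."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]

def is_hermitian(m, tol=1e-12):
    """True if m equals its conjugate transpose up to tol."""
    d = dagger(m)
    n = len(m)
    return all(abs(m[i][j] - d[i][j]) < tol for i in range(n) for j in range(n))

random.seed(0)
n = 3
# An arbitrary complex matrix with no symmetry assumption (illustrative only).
A = [[complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]
     for _ in range(n)]
Ad = dagger(A)

Hx = combo(A, Ad, 0.5, 0.5)               # (A + A*)/2, coefficient of d/dx
Hy = combo(A, Ad, 1 / (2j), -1 / (2j))    # (A - A*)/(2i), coefficient of d/dy

print(is_hermitian(Hx), is_hermitian(Hy))
```

Since $A = H_x + i H_y$ with both parts Hermitian, the system in the real variables $(x, y)$ has Hermitian coefficient matrices even though $A$ itself has no symmetry.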


See also: Computational Methods in General Relativity: 
The Theory; Einstein Equations: Initial Value 
Formulation; Evolution Equations: Linear and Nonlinear; 
Magnetohydrodynamics; Partial Differential Equations: 
Some Examples; Semilinear Wave Equations; Shock
Wave Refinement of the Friedman-Robertson-Walker
Metric.


Introduction: Spacetime Symmetries


Symmetries have played, and continue to play, an
important role in fundamental physics, but the part
they play is today seen as more complicated and
many-sided than it was in the early days of particle
physics, just after the Second World War. The area
in which symmetries have had their most dramatic
consequences is elementary particle physics, or
high-energy physics, and the majority of this article
is concerned with this subject. The article concludes
with some observations about symmetries and
conservation laws in general relativity.

In the early days, considerations of symmetry 
were almost limited to Lorentz transformations: we 
begin by reviewing this crucially important topic. 
Invariance of the laws of nature under translations
in space and time is actually necessary for the
existence of science itself; if experiments did not
yield the same results today and tomorrow, and in
Paris and Moscow and on the Moon, then in effect
there would be no laws of nature. Almost as strong
a statement could be made about invariance under
rotations; if space were not isotropic, experimental 
results would depend on which direction the 
apparatus was aligned in, and again any laws 
would be extremely hard to find. Turning to the 
question of motion, Newton and Galileo realized 
that the laws of dynamics are the same in all inertial 
frames in relative motion. In the Newton-Galileo 
scheme, the rule for relating the space and time 
coordinates of two frames of reference is (for 
relative motion along the common x-axis) 


$$x' = x - vt, \qquad t' = t \qquad [1]$$


This principle of relativity was reaffirmed by 
Einstein, but with the crucial modification that the 
rules for relating coordinates in two frames are 
given by Lorentz transformations, so that [1] is 
replaced by 


$$x' = \gamma(x - vt), \qquad t' = \gamma\left(t - \frac{vx}{c^2}\right) \qquad [2]$$
Time is absolute in [1] but relative in [2]. Einstein 
was of course motivated by the fact that Maxwell's 
equations are covariant under Lorentz transforma- 
tions, but not under Newton-Galileo ones. 
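
The difference between the two rules can be made concrete with a small numerical sketch. It assumes units with $c = 1$ and the standard Lorentz factor $\gamma = (1 - v^2/c^2)^{-1/2}$; the sample event and velocity are arbitrary illustrative values, and the function names are hypothetical. The Lorentz rule preserves the interval $c^2t^2 - x^2$, while the Newton-Galileo rule keeps $t$ fixed and does not.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an assumption of this sketch

def galilean(x, t, v):
    """Newton-Galileo rule [1]: time is absolute."""
    return x - v * t, t

def lorentz(x, t, v, c=C):
    """Lorentz rule [2] for relative motion along the common x-axis."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / c ** 2)

def interval(x, t, c=C):
    """Invariant interval c^2 t^2 - x^2 of an event measured from the origin."""
    return (c * t) ** 2 - x ** 2

x, t, v = 3.0, 5.0, 0.6          # illustrative event and frame velocity
xl, tl = lorentz(x, t, v)
xg, tg = galilean(x, t, v)

print("original interval :", interval(x, t))
print("after Lorentz     :", interval(xl, tl))
print("after Galilean    :", interval(xg, tg))
```

Both rules move the event to the same kind of shifted position, but only the Lorentz rule changes $t$ in the way needed to keep the interval, and hence the speed of light, the same in both frames.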

The above considerations reveal that the laws of 
nature should be covariant under ten types of 
transformation: three translations in space, one in 
time, three parameters (angles) for rotations and 
three velocities. These transformations together 
form a group, the inhomogeneous Lorentz, or 
Poincaré group. It is a nonabelian group whose ten 
generators correspond to 4-momentum, angular 
momentum, and Lorentz boosts. The seminal work 
on the significance of this group in fundamental 
physics is that of Wigner in 1939. Assuming that the 
states of fundamental quantum systems (particles, 
atoms, molecules) form the basis states for repre- 
sentations of this group, these entities are described 
by two quantities, mass and spin. Spin, moreover, 
which was already familiar from earlier investiga- 
tions in quantum physics, was described by the 
rotation group (SU(2), which is homomorphic to 
SO(3)) only for states with timelike momentum. For 
photons, for example, with null momentum, spin is 
described by the (noncompact) Euclidean group in 
the plane, with the consequence that there are only 
two polarization states for this massless particle. 

Noether's theorem provides the crucial link 
between symmetries and conservation laws, via the 
principle of least action. Noether showed that the 
invariance of the action under a continuous 
symmetry operation implied the existence of a 
conserved quantity. The conserved quantities 
corresponding to invariance under translation in
space and time are momentum and energy; con- 
servation of angular momentum follows from 
invariance under rotations and invariance under 
Lorentz transformations gives rise to conservation 
of motion of the center of mass. 


Gauge Theories: Electromagnetism 
and Yang-Mills Theories 


A quantity whose conservation has been well known 
for a long time is electric charge. The question may 
then be asked: invariance under what symmetry 
gives rise to conservation of electric charge? A 
classical complex field has the Lagrangian density 


$$\mathcal{L} = (\partial_\mu\phi)(\partial^\mu\phi^*) - m^2\phi^*\phi \qquad [3]$$
which is invariant under
$$\phi \to \exp(-iQ\Lambda)\,\phi \qquad [4]$$


$\Lambda$ being the parameter for the transformation.
Noether’s theorem then yields conservation of $Q$,
interpreted as electric charge. With $\Lambda$ a constant, as
above, the Lagrangian possesses a “global” symmetry.
This becomes a “local” symmetry when $\Lambda$
becomes space and time dependent, $\Lambda(\mathbf{r},t)$ or
$\Lambda(x^\mu)$. In that case, however, the Lagrangian [3] is
no longer invariant under [4], because of the
derivative terms. To preserve invariance an extra
field $A_\mu$ must be introduced, so that [4] then
becomes


$$\phi \to \exp(-iQ\Lambda(x^\mu))\,\phi$$


1 [5] 
gon 


and the Lagrangian acquires extra terms, involving
$A_\mu$. The field $A_\mu$ is called a gauge field and is
identified with the electromagnetic potential. The
transformation [5] is called a gauge transformation,
and since the phase factor $\exp(-iQ\Lambda)$ may be
regarded as a unitary 1 x 1 matrix, we have here a 
theory with U(1) gauge invariance, which describes 
electromagnetism and conservation of charge. 
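
The statement that the Lagrangian [3] is invariant under a constant phase but not under a spacetime-dependent one can be illustrated numerically. The sketch below works in one space and one time dimension with metric signature $(+,-)$, evaluates the derivatives by central finite differences, and uses an arbitrary sample field; the field, the phase functions, and the values of `M`, `Q` and `H` are illustrative assumptions, not taken from the article.

```python
import cmath

M = 1.0    # mass parameter m (illustrative value)
Q = 1.0    # charge Q appearing in the phase (illustrative value)
H = 1e-5   # finite-difference step

def phi(t, x):
    """An arbitrary smooth complex sample field."""
    return cmath.exp(1j * (0.7 * x - 1.3 * t)) * (1.0 + 0.2 * x)

def lagrangian(field, t, x):
    """L = |d_t phi|^2 - |d_x phi|^2 - m^2 |phi|^2 at the point (t, x)."""
    dt = (field(t + H, x) - field(t - H, x)) / (2 * H)
    dx = (field(t, x + H) - field(t, x - H)) / (2 * H)
    return abs(dt) ** 2 - abs(dx) ** 2 - M ** 2 * abs(field(t, x)) ** 2

def transformed(field, lam):
    """Apply phi -> exp(-i Q Lambda) phi for a given function Lambda(t, x)."""
    return lambda t, x: cmath.exp(-1j * Q * lam(t, x)) * field(t, x)

global_phase = transformed(phi, lambda t, x: 0.4)                  # constant Lambda
local_phase = transformed(phi, lambda t, x: 0.4 * x * x + 0.3 * t)  # Lambda(t, x)

L0 = lagrangian(phi, 0.5, 1.0)
L_global = lagrangian(global_phase, 0.5, 1.0)
L_local = lagrangian(local_phase, 0.5, 1.0)

print(L0, L_global, L_local)
```

The first two values agree to rounding error, while the third differs: the derivative of the phase contributes extra terms, which is exactly what the gauge field is introduced to absorb.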

The notion of isospin had been introduced by 
Heisenberg in 1932. Isospin (then called isotopic 
spin) was a vector-like quantity conserved in strong 
(nuclear) interactions. Yang and Mills in 1954 made 
the pioneering suggestion that isospin conservation 
could also be recast as a gauge theory, by enlarging 
the U(1) group of electromagnetism to SU(2) 
(corresponding to rotations in “isospin space"), 
and at the same time treating the rotation angles as 
functions of spacetime. Then, eqn [4] will change: if 


Ag — Ay + 
for example $\psi$ is an isospinor field, then local
isospin rotations are given by

$$\psi(x) \to \exp\left\{-i\,\frac{\boldsymbol{\tau}}{2}\cdot\boldsymbol{\theta}(x)\right\}\psi(x) = U(x)\,\psi(x) \qquad [6]$$

where $\boldsymbol{\tau}$ are the Pauli matrices: $\boldsymbol{\tau}/2$ are the generators
of SU(2). The gauge field then has three components
$A_\mu^i$ ($i = 1, 2, 3$) which may be written as a matrix

$$A_\mu = A_\mu^i\,\frac{\tau^i}{2}$$

transforming as

$$A_\mu \to A'_\mu = U(x)\,A_\mu\,U^{-1}(x) - \frac{i}{g}\,(\partial_\mu U)\,U^{-1}(x) \qquad [7]$$

where $g$ is the coupling constant, analogous to
electric charge. The problem with this idea was that
the isospin gauge field, analogous to the photon in
electrodynamics, should, like the photon, be massless
and have polarization states $\pm 1$ (commonly, but
inaccurately, called spin 1; see the work of Wigner (1939));
whereas the Yukawa particle, identified as the
$\pi$ meson, was massive and had spin 0, so could not act
as the isospin gauge field.

The Yang-Mills idea really came into its own 
with the standard model (SM) of particle physics. 
This (gauge) model has an invariance group
$SU(2)\otimes U(1)\otimes SU(3)$, the first two groups corresponding to
electroweak interactions (a unification of weak 
interactions and electromagnetism) and the final 
SU(3) to quantum chromodynamics (QCD), the 
gauge theory describing quark interactions, which 
“glues” them together to make hadrons — protons, 
neutrons, pions, etc. This model is a dramatically 
successful one. The QCD sector of the theory 
requires essentially no further elaboration on the 
Yang-Mills idea than replacing the group SU(2) by 
SU(3). This is a straightforward matter of replacing 
the generators $\boldsymbol{\tau}/2$ of SU(2) with the eight generators
(3 x 3 matrices) of SU(3). U(x) then also becomes a 
3 x 3 matrix. The three degrees of freedom are the 
three quark “colors,” for which there is good 
experimental evidence, and the gluons, the quanta 
of the gauge fields, are indeed massless and have 
good experimental support. In the electroweak 
sector, however, the gauge fields, the W and Z 
bosons, were found with the predicted masses of 
80.3 and 91.2 GeV respectively (the proton mass, for
comparison, is 0.94 GeV). They are certainly not
massless, as the straightforward Yang-Mills theory 
would require, and the explanation for this requires 
the introduction of the concept of spontaneous 
symmetry breaking. 


Spontaneous Symmetry Breaking 


The general idea of spontaneous symmetry breaking 
is that the vacuum — the state of lowest energy — is 
not invariant under the symmetry in question. A 
simple and common illustration is a pencil balanced 
vertically on its tip on a horizontal plane. The pencil 
is in unstable equilibrium but the system has a 
symmetry under rotations in the plane about the 
axis coincident with the pencil. Eventually, the 
pencil will fall into its lowest-energy state (vacuum), 
lying on the table in some direction — and the 
rotational symmetry is then lost. In fact, under 
rotations the actual lowest-energy (vacuum) state 
will be changed into another such state. There is a 
degenerate vacuum. 

A similar scenario may be constructed in a 
complex scalar field theory. Consider such a theory 
with a Lagrangian given by 


$$\mathcal{L} = (\partial_\mu\phi)(\partial^\mu\phi^*) - m^2\phi^*\phi - \lambda(\phi^*\phi)^2 \qquad [8]$$
that is, with a potential energy function given by
$$V(\phi,\phi^*) = m^2\phi^*\phi + \lambda(\phi^*\phi)^2 \qquad [9]$$


where $m$ is the mass of the field (quantum) and $\lambda$ is the
coupling of its self-interaction. The ground state is
obtained by minimizing $V$, hence $\partial V/\partial\phi = 0$, giving
(assuming that $m^2 > 0$) a minimum at $\phi = \phi^* = 0$.
If, however, $m^2 < 0$, there is a local maximum at
$\phi = 0$ and a minimum at $|\phi|^2 = -m^2/2\lambda > 0$. In
quantum theory language, the vacuum expectation
value $\langle 0|\phi|0\rangle$ of the field is nonzero. Goldstone
showed that this implied the presence of a massless 
scalar particle - a Goldstone boson. There was some 
interest in this result in particle physics, where the 
hypothesis of *partial conservation of the axial vector 
current" (PCAC) might result in a Goldstone boson 
that could be identified with the pion; although not 
massless, the pion is the lightest hadron, so “almost” 
massless. 
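
The location of the minimum of the potential [9] can be confirmed with a short numerical sketch. It treats $V$ as a function of $\rho = |\phi|^2$, so that $V = m^2\rho + \lambda\rho^2$, and compares a brute-force grid search with the closed form $\rho = -m^2/2\lambda$. The parameter values and function names are illustrative assumptions.

```python
def potential(rho, m2, lam):
    """V as a function of rho = |phi|^2: V = m2*rho + lam*rho**2."""
    return m2 * rho + lam * rho ** 2

def minimum_rho(m2, lam):
    """Closed-form minimizer in rho >= 0 for lam > 0."""
    return 0.0 if m2 >= 0 else -m2 / (2 * lam)

def grid_minimum(m2, lam, rho_max=5.0, steps=50001):
    """Brute-force minimizer of V on a uniform grid over [0, rho_max]."""
    best = min(range(steps),
               key=lambda k: potential(rho_max * k / (steps - 1), m2, lam))
    return rho_max * best / (steps - 1)

# Symmetric case, m^2 > 0: the minimum sits at phi = 0.
print(grid_minimum(1.0, 0.25), minimum_rho(1.0, 0.25))

# Broken case, m^2 < 0: the minimum moves to |phi|^2 = -m^2 / (2*lambda).
print(grid_minimum(-1.0, 0.25), minimum_rho(-1.0, 0.25))
```

Because $V$ depends only on $|\phi|$, the minimum in the broken case is a whole circle of field values related by the phase rotation [4], which is the degenerate vacuum described above.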

Higgs analyzed what happens to the Goldstone 
model if electromagnetism is included. The Lagran- 
gian [8] is invariant under the global transformation 
[4], but if this is made local, as in [5], a gauge field 
must be introduced and it is found that the massless 
Goldstone boson disappears and the massless gauge 
field (photon) becomes massive. Thus, spontaneous 
symmetry breaking of a gauge theory results in the 
appearance of a massive, rather than massless, gauge 
particle. (It is relevant to remark that a massless 
photon possesses two polarization states, but a 
massive one possesses three, so the number of spin- 
polarization states is preserved - the massless 
photon “eats” the Goldstone boson and becomes 
massive.) The Higgs model was generalized to the 




case of a nonabelian symmetry group by Guralnik, 
Hagen, and Kibble and invoked by Weinberg in his
1967 model for the electroweak interaction in which
the gauge quanta were massive. 

Higgs’ work was motivated by the theory of 
superconductivity, where the Meissner effect (expulsion
of magnetic flux from a superconductor), treated
relativistically, implies that the effective mass of a
photon in a superconductor is nonzero; this is
the “reason” that the flux does not penetrate. In the
theory of Bardeen, Cooper, and Schrieffer (BCS), a 
superconductor is described by an effective scalar 
field, a composite of electron pairs (though paired in 
momentum space rather than coordinate space), and 
this provides a physical analogy with the model 
above. The SM of particle physics postulates a Higgs 
scalar field analogous to the BCS composite scalar 
field. If this field exists, Higgs particles should also 
exist, but they have not yet been found. This is an 
outstanding problem for the SM. 


Baryon and Lepton Numbers 


The fact that the proton $p$ does not decay into
positron plus photon, $e^+ + \gamma$, or muon plus photon,
$\mu^+ + \gamma$, implies a conservation law of baryon
number $B$ (the proton possessing $B = 1$ and the
others $B = 0$). Furthermore, the stability of $\mu$ and
$\tau$ against decay into $e + \gamma$ implies conservation of
lepton numbers $L_e$, $L_\mu$, and $L_\tau$. These are regarded
as global, not local, symmetries, so there are no 
associated gauge fields or interactions. Interestingly, 
however, these symmetries are not built into the SM, 
so are not guaranteed by it. More interestingly, these 
symmetries are actually destroyed in one attempt to 
go beyond the SM. This is the hypothesis that QCD 
may be unified with electroweak interactions to 
produce a “grand unified” theory (GUT). The 
simplest GUT is the one in which the
$SU(2)\otimes U(1)\otimes SU(3)$ symmetry is assumed to be a subgroup of the
much larger symmetry SU(5), and in that theory the
proton is unstable:


$$p \to e^+ + \pi^0 \qquad [10]$$


The predicted lifetime is $10^{29\pm 1}$ years, while a recent
estimate of the lifetime for this decay mode is > 
5 x 10% years. It may be that GUTs do not exist in 
nature, but since the decay [10] violates conservation
of the quantities $B$ and $L_e$, even entertaining the
idea that the decay might take place begs the 
question, *are these conservation laws sacrosanct?" 

Another recent development which leads to the 
same question is the subject of neutrino oscillations. 
A strong motivation for this is the solar neutrino 


problem; this is the problem that the number of 
electron neutrinos detected on Earth, originating in 
the Sun, is less than the number predicted, by a 
factor close to 3. The mismatch could be at least 
partly, and perhaps completely, explained if electron 
neutrinos “oscillated” into muon and/or tau neutri- 
nos on their passage from the Sun to the Earth, since 
the reaction which detects the neutrinos on Earth is 
sensitive only to electron neutrinos, and not to the 
other species. But oscillation is only permitted if
$L_e$, $L_\mu$, and $L_\tau$ are not separately conserved
quantities. Oscillation can also only take place if the
masses of the different neutrinos are different (the
oscillation rate depends on $\Delta m^2$), hence not all
the neutrinos may be massless.


Discrete Symmetries 


Ever since parity violation was discovered in weak 
interactions (nuclear beta decay) by Wu in 1957, the 
whole subject of discrete symmetries has presented 
problems which are still not resolved. The symme- 
tries in question are 


P (space inversion): $(x, y, z) \to (-x, -y, -z)$

T (time reversal): $t \to -t$

C (particle-antiparticle conjugation): particle $\to$ antiparticle


Are the laws of physics invariant under these 
operations? The Wu experiment revealed that weak 
interactions are not invariant under P, but what 
about other interactions and other operations? In 
this context, the CPT theorem is highly important. 
According to this theorem (based on very general 
assumptions), all laws of nature must be invariant 
under the combined operation CPT, so that, for 
example, the fact that weak interactions are not 
invariant under P means that they are not invariant 
under the product CT either. 

The violation of P invariance in beta decay was 
soon related to the fact that the neutrino involved 
(the electron neutrino — or, to be precise, antineu- 
trino) was massless. Spin-1/2 particles like the 
electron and neutrino obey the Dirac equation, 
which may be written out as a pair of coupled 
equations for left- and right-handed states. In the 
case $m = 0$, however, these equations decouple so it
is possible to have a massless spin-1/2 particle which 
is either left-handed or right-handed. Any interac- 
tion involving this particle would automatically 
violate parity (which turns a left-handed state into 
a right-handed one). Experiments have verified that 
the neutrino is indeed left-handed. The SM incorporates
this in the sense that the left-handed electron
$e_L$ and the electron neutrino $\nu_e$ are assigned to a
weak isospin SU(2) doublet, while the right-handed
electron $e_R$ transforms as a singlet. A similar
pattern is repeated for the $\mu$ and $\tau$ particles and
their neutrinos. The phenomenon of neutrino
oscillations, on the other hand, does not allow all the
neutrino states also to be purely left-handed (since 
they cannot be massless). This poses a potential 
problem for the SM. 

For a few years after 1957 it was believed that beta 
decay violated C as well as P, but conserved the 
product CP; and indeed that all weak interactions 
were CP invariant. In 1964, however, it was found
that there is a small element of CP violation in $K^0$
decay. CP-violating effects are also expected in $B^0$
decays. The physical origin of CP violation is still not
understood, but its importance is that it implies T 
violation, so that in (at least some) weak interactions, 
there is an “arrow of time” on the subnuclear scale. 
(Such an arrow of time is, of course, familiar in 
thermodynamics.) This is used in a cosmological 
context to explain baryon-antibaryon asymmetry in 
the Universe. 


Baryon-Antibaryon Asymmetry 


In the standard model of cosmology it is shown that 
applying the known laws of physics to the early 
Universe (the first few minutes) leads to the 
conclusion that at an age of 226 s nuclear fusion
reactions took place resulting in a mixture of 74%
protons and 26% $\alpha$ particles, so that, hundreds of
thousands of years later, when galactic condensation 
took place, it would involve precisely this admixture 
of hydrogen and helium gases. Just this amount of 
helium has been found in the Sun, giving great 
confidence to the “big bang” model. Assuming that 
at extremely small times the baryon number of the 
Universe was zero, B=0, and assuming also (a big 
assumption, but one nevertheless made by cosmol- 
ogists) that the Universe is made of matter and not 
antimatter, we may then ask, why is this — where 
has the antimatter gone? 

Surprisingly, this question was addressed as early 
as 1966 by Sakharov, who showed that, starting 
with an initial state with $B = 0$, it would be possible
to reach a state with $B \neq 0$ as long as three
conditions obtained: B violating interactions, CP
and C violating interactions, and lack of thermal 
equilibrium. GUTs and ordinary weak interactions 
already provide possibilities for the first two of these 
conditions. Breakdown of thermal equilibrium will 
be expected to occur as the Universe expands. 
When the particle density is high, reactions such as
$p + \bar{p} \to \gamma + \gamma$ will ensure an equal population of
baryons and antibaryons, even in the presence of B
violating interactions, but as the density decreases
and this reaction rate becomes less than the
expansion rate, thermal equilibrium can no longer
be maintained. Thus, GUTs offer an explanation of 
why there is no antimatter in the Universe. It might 
be thought that this sort of explanation is implau- 
sible, since the B-violating and CP-violating forces 
are so weak, but actually this is not a problem, since 
the ratio of baryon number to photon number in the
Universe is of the order $N_B/N_\gamma = 10^{-9}$; so we may
conjure up a scenario in which the B and CP
violating forces give rise to a volume of space in
which there are, say, $10^9$ antibaryons, $10^9 + 1$
baryons and approximately the same number of
photons. Then, all the antibaryons become annihilated
leaving one baryon and $10^9$ photons, as
observed. 

A recent development in the area of discrete 
symmetries has been the suggestion by Kostelecky 
and coworkers that there might exist spontaneous 
violation of CPT and Lorentz symmetry. 


Topological Charges 


Conserved quantities of a quite different type have 
received a lot of attention in recent decades. Their 
conservation is a consequence of nontrivial bound- 
ary conditions for the fields. A famous example is 
the sine-Gordon “kink.” The sine-Gordon equation 

Po fd 1. 

=> — —— + —-sin(bd) = 0 11 

Of 0 b (99) tt] 
describes a scalar field in one space and one time 
dimension. It is a nonlinear equation which pos- 
sesses, among others, the interesting solution 


f(£) =F arctan expl-+(7/V/b)E 


where $\xi = x - vt$ and $\gamma = (1 - v^2)^{-1/2}$. This corresponds
to a solitary wave which moves, preserving
its shape and size, in distinction to usual waves,
which spread out and dissipate. Waves of this type
are called solitons, and solitons have in fact been
observed moving along canals. In this case, they are
solutions to the Korteweg-de Vries equation. Equation
[11] clearly possesses the constant solutions

$$\phi = \frac{2\pi n}{b} \qquad (n \in \mathbb{Z})$$

which, it may be shown, all have zero energy. We
may then construct a solution of the above type, but
with $n = 0$ as $x \to -\infty$ and $n = N$ as $x \to +\infty$. This
so-called “kink” solution has finite energy and is not
continuously deformable into a solution with $n = 0$
everywhere, since this would involve overcoming an
infinite energy barrier. The “kink number” may be
characterized as a charge: defining the current

$$J^\mu = \frac{b}{2\pi}\,\varepsilon^{\mu\nu}\,\partial_\nu\phi$$

with $\varepsilon^{\mu\nu}$ the totally antisymmetric symbol, it is clear
that this is identically conserved, $\partial_\mu J^\mu = 0$. This is a
consequence of the definition of $\varepsilon^{\mu\nu}$; it is not a
consequence of invariance of the sine-Gordon
Lagrangian under a symmetry operation, so the
current $J^\mu$ is not a Noether current. The associated
conserved charge is

$$Q = \int J^0\,dx = \frac{b}{2\pi}\int \frac{\partial\phi}{\partial x}\,dx = \frac{b}{2\pi}\,[\phi(\infty) - \phi(-\infty)] = N$$

Models of the above type may be written down in 
a spacetime with more than two dimensions. In that 
case the above solution depends only on one 
coordinate, so represents an infinite planar “domain 
wall,” on the two sides of which the field assumes 
different values. Such domain walls, as well as 
“cosmic strings,” are considered as serious possibi- 
lities in cosmology. 

Nonabelian gauge theories and the sigma model 
also provide a fertile ground for topological excita- 
tions — field configurations which for topological 
reasons do not decay. Gauge theories with sponta- 
neous symmetry breaking have two-dimensional 
solutions corresponding to vortex lines and three- 
dimensional solutions corresponding to magnetic 
monopoles. In spacetime (3 + 1 dimensions), there 
is a solution to the gauge field equations, with no 
spontaneous symmetry breaking, corresponding to 
an “instanton,” a finite-energy field configuration, 
localized in time as well as in space (hence the name). 
The gauge group here is SU(2), whose group space is
$S^3$. Spacetime is “Euclideanized” into $\mathbb{R}^4$, whose
boundary is then $S^3$. Asymptotic field configurations
may then be characterized by mappings of $S^3$ in field
space into $S^3$ in parameter space, and since the third
homotopy group of $S^3$ is nontrivial, $\pi_3(S^3) = \mathbb{Z}$, these
field configurations belong to different classes and
are not deformable into each other. These define
“degenerate vacua” of the gauge field equations. In
quantum theory, tunneling between these vacua is
allowed and 't Hooft has shown how this may give
rise to deuteron decay $d \to e^+ + \bar{\nu}_\mu$. Other examples
of topologically nontrivial configurations are
so-called sphalerons, which may also contribute to 
baryon number violation in the early Universe, and 
skyrmions, constructs in the nonlinear sigma model 
which serve as a model for baryon number. 


Supersymmetry 


Supersymmetry is a fermion-boson symmetry, pos- 
tulating that multiplets of fundamental particles 
contain both fermions and bosons. Thus, for 
example, since electrons exist there should also be 
“selectrons” — “scalar” electrons, with spin 0. There 
should also be photinos, with spin 1/2, to take their 
place alongside photons, and so on. If supersymme- 
try were exact, these particles would have the same 
mass as their partners and would have all been 
found, but in fact none have yet been discovered, so 
presumably supersymmetry is a broken symmetry. 
The feature that makes supersymmetry attractive is 
that it holds some promise for solving divergence 
problems in quantum field theory, since the radia- 
tive corrections from fermion and boson loops are 
opposite in sign and may exactly cancel. Super- 
symmetric models can also help to solve the 
so-called hierarchy problem in quantum field theory. 
If supersymmetry is made into a local symmetry, 
rather than simply a global one, extra fields must be 
introduced (as the photon field was introduced 
above), and it turns out that one of these is a spin-2 
field, which may be identified with the graviton. 
Local supersymmetry thus becomes supergravity. 


General Relativity 


Symmetries and conservation laws take on new aspects 
when general relativity is considered. Einstein’s field
equations relate the energy-momentum tensor of
matter (and radiation) to the Einstein tensor of spacetime,
which is built from the Ricci tensor.
The Einstein tensor has vanishing covariant divergence,
which means that the energy-momentum tensor
possesses the same property, but conservation of
energy and momentum requires that it is the ordinary
derivative, not the covariant one, of this tensor that
should vanish. It might be expected that this problem
could be alleviated by including the contribution of the
gravitational field itself in the energy-momentum tensor.
This is quite reasonable, but then problems of 
interpretation arise, since at any one point in a general 
spacetime, a coordinate system might be found which 
is inertial (this is the force of the equivalence principle), 
corresponding to no gravitational field, and therefore 
no energy. The usual procedure is to introduce an 
energy-momentum “pseudotensor,” and to conclude 
that energy in a gravitational field is not localizable. 
The role of symmetries in general relativity is rather
different from their role in particle physics, which is set in
Minkowski spacetime. In a general spacetime there are 
no symmetries, but many examples of particular 
spacetimes with their own symmetries are now 
known. The symmetry operations involved are 
isometries, with corresponding groups of motion (so
that the isometry group of Minkowski space is the 
Poincaré group). These groups are an important 
subject of study in cosmology; for example, there is a 
classification of homogeneous cosmological models, 
labeled according to the Bianchi classification. 


See also: Cotangent Bundle Reduction; Effective Field 
Theories; Electroweak Theory; General Relativity: 
Overview; Infinite-Dimensional Hamiltonian Systems; 
Noncommutative Geometry and the Standard Model; 
Quantum Field Theory: A Brief Introduction; 
Quasiperiodic Systems; Sine-Gordon Equation; 
Supergravity; Symmetries in Quantum Field Theory of 
Lower Spacetime dimensions; Symmetry and Symplectic 
Reduction; Symmetry Classes in Random Matrix Theory; 
Topological Defects and Their Homotopy Classification. 




Symmetries in Quantum Field Theory of Lower Spacetime Dimensions


J Mund, Universidade de São Paulo, São Paulo, Brazil
K-H Rehren, Universität Göttingen, Göttingen, Germany

© 2006 Elsevier Ltd. All rights reserved.


Symmetries in Quantum Field Theory 


Symmetries have proved to be one of the most 
powerful concepts in quantum theory, and in 
quantum field theory in particular. From the 
beginnings of quantum mechanics, it is well known 
that the presence of a symmetry allows one to 
predict relations between different measurements, to 
classify spectra (energy or other), and to understand 
the Pauli exclusion principle, to name only a few 
applications. Much more remarkably, in modern 
relativistic quantum field theory, designed to 
describe the interactions of elementary particles, 
fundamental interactions have been found to be 
induced by the principle of local gauge invariance. 

One distinguishes spacetime symmetries (Poincaré 
or conformal transformations), which change the 
position and orientation of the system in space and 
time, and internal symmetries, which preserve the 
localization, acting on certain internal degrees of 
freedom. The Coleman-Mandula (1967) theorem
states that internal and spacetime symmetries cannot
be mixed, in the sense that the generators of internal
symmetries must be Lorentz scalars, hence the total
group of symmetries factorizes into a direct product.
Supersymmetries are an exception to this theorem
because their generators do not form a Lie algebra,
and they were in fact designed to circumvent the
Coleman-Mandula theorem.

It is well known that the structure of symmetries 
of quantum systems in low-dimensional spacetime 
differs significantly from that in four-dimensional 
spacetime. (“Low” means in our context two or 
three, depending on the type of charge localization, 
cf. below.) To name some examples:


• Two-dimensional quantum systems may have much
  higher symmetries than four-dimensional ones:

  - In two dimensions, there exist massive integrable
    models with infinitely many conservation
    laws and factorizable scattering matrices (see
    Integrability and Quantum Field Theory).
    These models exhibit solitonic superselection
    sectors, cf. below.

  - The conformal group of two-dimensional
    spacetime is infinite dimensional, allowing for
    the exact computation of correlation functions
    with the help of Ward identities (Belavin,
    Polyakov, and Zamolodchikov 1984). Only
    the finite-dimensional Möbius group, however,
    is also a symmetry of the vacuum state.
    Möbius covariance implies that the theory
    contains two subtheories of chiral fields
    defined on the light rays $t - x = \text{constant}$,
    resp. $t + x = \text{constant}$, and that these can be
    extended to fields defined on a circle, by
    adding a “point at infinity” to the light ray
    (Lüscher and Mack 1976). One arrives thus at
    one-dimensional chiral quantum field theories
    on a circle, which will play an important role
    in the discussion below.

• Continuous symmetries cannot be spontaneously
  broken in two dimensions. The latter is true not
  only for relativistic quantum field theory (Coleman
  1973), but also in quantum statistical
  mechanics (Mermin and Wagner 1966) where
  it is responsible for the absence of ferromagnetism
  (see Symmetry Breaking in Field Theory).
  Spontaneous symmetry breakdown requires
  long-range order which is overcome by thermal
  fluctuations down to zero temperature, because
  these diverge logarithmically (in the thermodynamical
  limit) in two dimensions. This theorem
  thus illustrates how the spacetime dimension-dependent
  size of phase space has an effect on
  internal symmetries of quantum systems. A
  detailed mathematical analysis of the balance
  between phase space (thermal fluctuations) and
  long-range order (symmetry breakdown) has
  been given in a recent discussion of the Goldstone
  theorem (Buchholz, Doplicher, Longo and
  Roberts 1992).

• The Coleman-Mandula theorem, excluding a
  mixing between internal and spacetime symmetries
  (see above), is valid only in higher
  dimensions.


In more recent times, it has become apparent that 
low-dimensional quantum systems do not only 
admit more symmetries, but they may exhibit 
internal symmetries of an entirely new type, not 
describable by groups of transformations. In this 
article, we shall focus on the various ways in which 
the new symmetries can arise, and how they can be 
understood. In order to properly appreciate these 
issues, let us first recall some basic symmetry 
concepts in the conventional case. 

In the traditional setting, symmetries arise in the 
form of groups of transformations of the quantum 
system which leave observable quantities (e.g., 
vacuum expectation values and correlation 
functions) invariant. The symmetries form a group 
of *-automorphisms of the algebra of fields: 


$\alpha_g(\phi_1\phi_2) = \alpha_g(\phi_1)\,\alpha_g(\phi_2)$ 
$(\alpha_g(\phi))^* = \alpha_g(\phi^*)$ [1] 
$\alpha_{g_1}\alpha_{g_2} = \alpha_{g_1 g_2}$ 


(typically given by linear transformations of field 
multiplets). In the strongest case, the automorphisms 
are implemented by unitary operators on the state 
space 


$U(g)\,\phi\,U(g)^* = \alpha_g(\phi)$ [2] 


The implementers form a representation of the 
group of automorphisms, 


$U(g_1)U(g_2) = U(g_1 g_2)$ [3] 


and there is an invariant vector state (a ground state, 
or the vacuum state in relativistic quantum field 
theory), 


$U(g)\,\Omega = \Omega$ [4] 


However, depending on the dynamics of the 
quantum system, these relations cannot always be 
fully realized. One therefore considers several 
weaker or more general notions of symmetries 
relevant in four dimensions: 


• Spontaneously broken symmetries. The transfor- 
mations are given as automorphisms of an 
algebra but are not unitarily implemented 
in a given irreducible representation of the 
algebra. Invariant pure states do not exist. 

• Projective representations. The symmetries are 
unitarily implemented, but the implementers fail 
to satisfy the group law [3]. They give rise to ray 
(projective) representations or representations of a 
covering group. In particular, an invariant state 
vector as in [4] cannot exist in an irreducible 
representation. 

• Infinitesimal symmetries. Lie algebras of infinite- 
simal transformations, given as derivations of an 
algebra, which cannot be integrated to finite 
transformations. Derivations may or may not be 
implemented in a given representation of the algebra 
by commutators with self-adjoint generators. 

• Supersymmetry. The infinitesimal transforma- 
tions form a graded Lie algebra. 

• Local gauge symmetries form an infinite- 
dimensional group which is, however, not 
realized as automorphisms of the quantum alge- 
bra. Quantization of classical gauge interactions 
usually proceeds by breaking the gauge invariance 
in some way and restoring it at a later stage. 


The Connection between Symmetry 
and Superselection Sectors 


It is often convenient to describe a model in terms of 
localized fields which do not represent an observable 
(in the sense of quantum mechanics that an operator 
corresponds to some measurement prescription). For 
example, Fermi fields which violate the principle of 
causality because they anticommute with each other 
at spacelike distance rather than commute are not 
observables. Only fields which are quadratic in the 
Fermi fields (densities of charge, current, energy) are 
observables. This means that an internal symmetry 
is used in order to distinguish the observables as 
those operators which are invariant under the 
symmetry: in the example, the symmetry transfor- 
mation multiplies each Fermi field by −1 (by the 
spin-statistics theorem, this transformation coincides 
with the univalence of the Lorentz group). We 
characterize this situation by writing 


$A(O) = F(O)^G$ [5] 


where A(O) and F(O) stand for the algebras of 
observables and fields localized in some spacetime 
region O, respectively, G is the internal symmetry 
group acting by automorphisms on each F(O) 
without affecting the localization, and 
$F(O)^G = \{a \in F(O) : \alpha_g(a) = a \ \text{for all}\ g \in G\}$ denotes the subalgebra 
of invariants. The internal symmetry group G which 
distinguishes the observables according to [5] is 
usually called the “(global) gauge group.” 

If the gauge symmetry G is unbroken in the 
vacuum state, then there is a well-known connec- 
tion between symmetry and superselection rules 
(see Symmetries and Conservation Laws): namely, 
the observables act reducibly on the vacuum 
Hilbert space representation of F because they 
commute with the unitary operators which imple- 
ment the symmetry (or with their infinitesimal 
generators, usually called charges). As a conse- 
quence, the validity of the superposition principle is 
restricted because two eigenstates of different 
eigenvalues of the charges cannot exhibit interfer- 
ence. In other words, they belong to different 
superselection sectors. Wick, Wightman, and 
Wigner (1952) were the first to point out this 
relation. We therefore call this scenario the “WWW 
scenario” for brevity. 

In the WWW scenario, the decomposition of the 
Hilbert space is determined by the central decom- 
position of the internal symmetry group (the 
eigenvalues of the Casimir operators). In this way, 
the superselection sectors are in one-to-one corre- 
spondence with the irreducible representations of 
the internal symmetry group. 


Superselection sectors of two-dimensional models 
do not follow the scheme expected from the WWW 
scenario (see below). This was most strikingly 
demonstrated through the classification of the 
unitary highest-weight representations of the 
Virasoro algebra (Friedan, Qiu, and Shenker) 
which is nothing other than the classification of the 
superselection sectors of the observable algebra 
generated by the chiral stress-energy tensor, and 
through the determination of their fusion rules by 
Belavin, Polyakov, and Zamolodchikov (1984). 

In two dimensions, one is therefore lacking a 
compelling a priori ansatz, like the WWW scenario, 
for describing the system in terms of auxiliary 
nonobservable charged fields. At this point, one 
may argue that from an operational point of view, a 
quantum field theory, and in particular its symme- 
tries, should be understood entirely in terms of its 
observables. (This viewpoint is emphasized in the 
algebraic approach to QFT, see Algebraic Approach 
to Quantum Field Theory.) We shall therefore now 
ask the opposite question: suppose we are given an 
algebra A of local observables (without knowledge 
of a field algebra and its gauge group). We define 
the superselection sectors intrinsically as (the unitary 
equivalence classes of) the positive-energy represen- 
tations of A. Then the question is: do these sectors 
arise through a WWW scenario from some field 
algebra and a gauge symmetry, and if so, can the 
latter be reconstructed from the given observables 
alone? 

The answer in four dimensions is positive, thanks 
to a deep result due to Doplicher and Roberts 
(1990). Let us sketch the line of reasoning leading to 
this result in some detail, because it shows how the 
connection between (global) gauge symmetry on the 
one hand and spacetime geometry on the other hand 
emerges through the principle of causality (locality) 
of relativistic quantum field theory, and because it 
makes apparent what is different in low-dimensional 
spacetime. 

The analysis is based on the general structure 
theory of superselection sectors due to Doplicher, 
Haag, and Roberts (DHR, 1971). The latter starts 
with a selection criterion invoking the concept of a 
localized charge: a superselection sector which by 
measurements within the causal complement of 
some spacetime region O cannot be distinguished 
from the vacuum sector. The heuristic idea is, of 
course, that the sector is obtained from the vacuum 
sector by placing some charge in the region O (e.g., 
by the application of a localized charged field 
operator to the vacuum vector). 

It has been shown (Buchholz and Fredenhagen 
1982) that positive-energy representations of 
massive theories always satisfy this selection criter- 
ion with a localization region O of the form of a 
narrow cone extending in spacelike direction. (In 
massless theories with long-range interactions, such 
as QED, the situation is more complicated because 
the charge creates an electric field whose flux at 
infinity does not vanish (Gauss’ law) and is not 
Lorentz invariant.) DHR assume that the localiza- 
tion region is even compact, and can be chosen 
arbitrarily within the unitary equivalence class of the 
representation. 

Exploiting a strong version of locality (Haag 
duality) for the vacuum representation of the 
observables, DHR proceed to define an associative 
composition (or fusion) law for positive-energy 
representations. This law is commutative only up 
to unitary equivalence. The crucial point is that the 
unitary intertwiner establishing this equivalence (the 
statistics operator) can be chosen in a unique way 
provided any pair of spacelike disconnected locali- 
zation regions can be continuously deformed into 
any other such pair. 

This point marks the separation between high and 
low dimensions. In two dimensions, in each pair of 
spacelike disconnected regions, one region is to the 
left of the other, thus distinguishing the pair 
$(O_1, O_2)$ from $(O_2, O_1)$. Consequently, they cannot 
be deformed into each other, and there arise two 
statistics operators. The same holds in three dimen- 
sions when the localization regions are spacelike 
cones, and $O_1, O_2$ are taken within (the causal 
complement of) some larger spacelike cone. If the 
spacetime dimension is at least 4, or if in three 
dimensions the localization regions are compact, 
then the statistics operator is unique and, as a 
consequence, coincides with its inverse. 

The (non-)uniqueness of the statistics operator has 
far-reaching consequences concerning our original 
question about the underlying gauge symmetry. 
Namely, the DHR analysis proceeds to show that 
the set of positive-energy representations equipped 
with the composition law, and the linear spaces of 
intertwiners between different representations, 
together form the mathematical structure of a C* 
tensor category. The statistics operators which are 
distinguished intertwiners give additional structure 
to this category: this structure is called a (permuta- 
tion) symmetry if the statistics operators coincide 
with their inverse, and it is called a braiding 
otherwise. (It gives rise to a representation of the 
permutation group or the braid group, respectively.) 
In other words, the spacetime topology, through the 
intervention of the uniqueness of the statistics 
operator, causes the tensor category to be symmetric 
in high dimensions, and braided in low dimensions. 


At a more elementary level, one may think of 
statistics operators as reflecting commutation rela- 
tions between the searched-for charged fields. Mak- 
ing an ansatz for the commutation relations at 
spacelike separation, essentially the same topological 
argument as before implies, together with Poincaré 
invariance, that the coefficients appearing in this 
relation should form a representation of the permu- 
tation group, or of the braid group, respectively. The 
DHR approach, however, is entirely intrinsic, 
avoiding any a priori assumption of charged fields. 

The duality theorem due to Doplicher and 
Roberts (1990) now states that every symmetric C* 
tensor category (with some further qualifications 
valid in the DHR setting) is isomorphic to the 
category of unitary representations of a compact 
group, in which the composition law is the tensor 
product and the (permutation) symmetry is the 
natural one. Moreover, the category uniquely 
determines the group, and by a crossed product 
construction (an action of the category on the 
algebra A) one reconstructs a field algebra F such 
that [5] holds. If fermionic sectors are present, then 
there is some arbitrariness in the commutation 
relations among the corresponding fermionic fields, 
which can be exploited to produce the normal 
commutation relations (fermionic fields anticom- 
mute among each other, and bosonic fields commute 
with any field at spacelike separation). This fixes the 
field algebra F up to unitary equivalence. The 
conclusion is that the WWW scenario is the most 
general in four dimensions (apart from the reserva- 
tions due to long-range forces, see above). 


Generalized Symmetries in Low 
Dimensions 


In view of the success of this program in four 
dimensions and the advantage of the WWW 
scenario for model building, the obvious challenge 
is to search for an analogous understanding of 
superselection sectors (charges) in low dimensions in 
terms of an algebra of charged fields and a gauge 
symmetry distinguishing the observables. This gauge 
symmetry cannot, in general, be a group for several 
reasons: 


• As stated before, the tensor category of super- 
selection sectors possesses only a braiding, rather 
than a (permutation) symmetry, hence the duality 
theorem fails. 

• One can associate a (statistical) dimension $d_\pi$ to 
each superselection sector $[\pi]$ which is multi- 
plicative under the composition law (fusion), and 
additive under direct sums. In a symmetric 
category, the dimensions are necessarily positive 
integers. Indeed, in the WWW scenario, they 
coincide with the naive dimension of the asso- 
ciated representation of the gauge group. But in 
the low-dimensional models, the dimensions turn 
out to be nonintegers in general. 

• Moore and Seiberg (1988) have axiomatized the 
superselection structure of chiral and two- 
dimensional conformal field theories in terms 
of a system of recoupling and braiding coeffi- 
cients controlling the fusion of sectors and its 
noncommutativity. (In fact, this system is 
basically equivalent to the DHR category.) For 
models such as SU(2) current algebras at level 
k, these coefficients turn out to coincide with 
the recoupling and braiding coefficients one can 
associate with a quantum group deformation 
(Drinfel'd 1986) of SU(2) with deformation 
parameter $q = -\exp i\pi/k$. Representations of 
quantum groups (quasitriangular Hopf algebras, 
see Hopf Algebras and q-Deformation Quantum 
Groups) have a tensor product defined in terms 
of a noncocommutative coproduct. Moreover, 
they possess a quantum dimension which is a 
q-deformation of an integer. The quantum 
dimensions precisely match the statistical dimen- 
sions of the superselection sectors. All this 
strongly suggests that quantum groups appear as 
generalized symmetries in two dimensions, at 
least in a large class of models. 
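
The noninteger dimensions mentioned in the list above can be made concrete with a small numerical sketch. It assumes the standard closed form for the statistical dimensions of the SU(2) level-k sectors, d_j = sin((2j+1)π/(k+2)) / sin(π/(k+2)), and the standard truncated fusion rules; neither formula is spelled out in the text, and the function names are illustrative only. The normalization of q behind this formula may differ from the parametrization quoted above.

```python
import math

def quantum_dimensions(k):
    """Dimensions d_a for the sectors a = 2j = 0, 1, ..., k of SU(2) at level k.

    Assumed standard formula: d_a = sin((a+1)*pi/(k+2)) / sin(pi/(k+2)).
    """
    n = k + 2
    return [math.sin((a + 1) * math.pi / n) / math.sin(math.pi / n)
            for a in range(k + 1)]

def fusion(a, b, k):
    """Labels c = 2j appearing in the truncated product of sectors a and b."""
    top = min(a + b, 2 * k - a - b)
    return list(range(abs(a - b), top + 1, 2))

def is_multiplicative(k, tol=1e-9):
    """Check d_a * d_b == sum of d_c over the fusion channels c, for all a, b."""
    d = quantum_dimensions(k)
    return all(
        abs(d[a] * d[b] - sum(d[c] for c in fusion(a, b, k))) < tol
        for a in range(k + 1)
        for b in range(k + 1)
    )

if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        print(k, [round(x, 6) for x in quantum_dimensions(k)], is_multiplicative(k))
```

For k = 1 all dimensions equal 1, while for k = 2 and k = 3 the spin-1/2 sector has dimension √2 and the golden ratio, respectively, so no compact group can have these numbers as dimensions of its representations.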


A natural testing ground for the search for 
appropriate generalized symmetry concepts in low 
dimensions is the abundance of models in chiral and 
two-dimensional conformal QFT (see Two- 
Dimensional Models). As mentioned before, confor- 
mal symmetry in two dimensions has far-reaching 
consequences, especially the existence of chiral quan- 
tum fields which are defined on a one-dimensional 
light ray. As a null direction in the two-dimensional 
spacetime, this ray unites both the spacelike property 
of carrying a causal structure, and the timelike 
property that the generator of translations has positive 
spectrum (energy). These two features together with 
Möbius covariance are so powerful that they allow for 
the exact construction of large classes of models. The 
most elementary ones (minimal models) are 
completely described by the chiral stress-energy 
density field, that is, the local generator of the 
conformal symmetry. Other models also contain 
currents which are the local generators of internal 
symmetries. These models exhibit many nontrivial 
superselection structures, which illustrate the wide 
range of possible deviations from higher-dimensional 
QFT, and at the same time exhibit possible 


approaches to appropriate symmetry concepts in 
low dimensions. 

Attempts to classify the possible algebraic struc- 
tures of generalized internal symmetries in a model- 
independent setting start from the idea that the 
representation category of the internal symmetries of 
a given model should be equivalent to the tensor 
category of its superselection sectors. Several alge- 
braic structures have been proposed as candidates, 
complying with this idea. They all assume specific 
modifications or deformations of eqns [1]-[5] above, 
highly constrained by self-consistency. Among these 
proposals are: 


• quantum groups (see, e.g., Fröhlich and Kerler 
1993), 

• weak quasiquantum groups (Mack and Schomerus 
1992) and rational Hopf algebras (Fuchs et al. 
1994), 

• weak C* Hopf algebras (Rehren 1997, Böhm and 
Szlachányi 1996) or quantum groupoids (Nik- 
shych and Vainerman 1998), and 

• braided groups (Majid 1991). 


In several cases, the respective “symmetry alge- 
bra” can be reconstructed from the tensor category 
of superselection sectors, and a field algebra with 
linear transformation behavior can be constructed 
which contains the observables as invariant ele- 
ments as in [5]. However, the situation is unsatis- 
factory for various reasons. First, the class of QFT 
models for which these constructions have been 
performed is quite restricted (most constructions 
work only for rational models, i.e., models with a 
finite set of charges); second, the reconstructed 
symmetry algebra is not unique and finally, the 
constructed field algebras have features which 
diverge significantly from the WWW scenario. For 
example, it is not always warranted that the 
quantum symmetries are consistent with the 
*-structure, indispensable for Hilbert space positiv- 
ity (a necessary prerequisite for the probability 
interpretation of quantum theory). Moreover, typi- 
cally there are global gauge transformations which 
are implemented by localized field operators, thus 
exhibiting a mixing of local and global concepts. It 
also happens that this holds for elements in the 
center of the symmetry algebra, which implies that 
the field algebra is not local relative to its gauge 
invariant elements, that is, the charged fields do not 
commute with the gauge-invariant elements at 
spacelike separation. In other constructions, the 
field algebra is not associative, or there are no finite 
field multiplets. 

Historically, the first candidate for a “symmetry 
algebra” compatible with braid group statistics has 
been the structure of a quantum group, as men- 
tioned above. However, in physically interesting 
models, the quantum group is not semisimple and 
thus has too many (namely, indecomposable) repre- 
sentations. Solutions to this problem have been: 


1. A BRS approach in an indefinite-metric frame- 
work (Hadjiivanov et al. 1991), 

2. “Truncation,” that is, discarding the “unphysi- 
cal” representations. Fröhlich and Kerler (1993) 
have done this consistently in a categorical 
framework. In fact, they have given a complete 
classification of the possible braided tensor 
categories generated by a single irreducible object 
with statistical dimension d satisfying 1 < d < 2, 
in terms of categories constructed from the 
“truncated” representations of $U_q(sl_2)$. Trunca- 
tion can also be performed by dividing the 
quantum group itself through the ideal which is 
annihilated by all “physical” representations, 
leading to a weak quasiquantum group (Mack 
and Schomerus 1992). 

3. Relaxing the axioms, thus admitting the more 
general structures mentioned above. 


All the above approaches assume a given general- 
ized symmetry concept and show to what extent 
field algebras complying with it can be constructed. 
They thus concern nonobservable objects, and it is 
no contradiction if different symmetry concepts can 
be associated with the same observable data. 

A more radical concept of global gauge symmetry, 
applicable to the low-dimensional case, has been 
developed by Longo and Rehren (1995). Its point of 
departure is the notion of a conditional expectation, 
which has the same abstract properties as a group 
average. In the WWW scenario, the Haar measure 
of the compact gauge group defines an average 


$\mu : F \ni \phi \mapsto \int_G dg\, \alpha_g(\phi) \in A$ [6] 


which is a positive linear map respecting the 
localization, and the observables are invariant, 
$\mu(a) = a$. In fact, the observables are exactly the 
image of this map, that is, [5] is equivalently 
formulated, but without reference to the group 
transformations, as 


$A(O) = \mu(F(O))$ [7] 
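
As a minimal sketch of why the group average has the abstract properties of a conditional expectation, one can take a finite toy model that is not part of the text: F is the algebra of 4×4 complex matrices, and G = Z_2 acts by conjugation with a hypothetical parity operator GAMMA. The average is then idempotent, positive, maps onto the invariant elements, and commutes with multiplication by invariant elements.

```python
import numpy as np

# Hypothetical toy model: F = 4x4 complex matrices, G = Z_2 = {1, g},
# with g implemented by the self-inverse unitary GAMMA.
GAMMA = np.diag([1.0, 1.0, -1.0, -1.0])

def alpha(a):
    """Action of the nontrivial group element: a -> GAMMA a GAMMA^{-1}."""
    return GAMMA @ a @ GAMMA

def mu(a):
    """Group average over Z_2 (each element has Haar weight 1/2)."""
    return 0.5 * (a + alpha(a))

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
b = mu(rng.normal(size=(4, 4)))  # invariant elements play the role of observables
c = mu(rng.normal(size=(4, 4)))

if __name__ == "__main__":
    # The image of mu consists of invariant elements.
    print(np.allclose(alpha(mu(a)), mu(a)))
    # mu is a projection: averaging twice changes nothing.
    print(np.allclose(mu(mu(a)), mu(a)))
    # Bimodule property: mu(b a c) = b mu(a) c for invariant b, c.
    print(np.allclose(mu(b @ a @ c), b @ mu(a) @ c))
    # Positivity: mu maps the positive element a* a to a positive matrix.
    print(np.linalg.eigvalsh(mu(a.conj().T @ a)).min() >= -1e-12)
```

These four properties are the ones retained when the group is dropped and only the map is kept.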


Turning to the observables A of a quantum field 
theory in low dimensions, one looks for a quantum 
field theory F, containing A and equipped with a 
conditional expectation $\mu$ such that [7] holds, and 
which preserves the vacuum state. F may not satisfy 
local commutativity, but it should be local relative 
to the observables in the sense mentioned before. In 
rational chiral CFT, such extensions can be classi- 
fied (and indeed constructed) in terms of the super- 
selection category of A, giving direct access to the 
decomposition of the vacuum Hilbert space of F into 
superselection sectors of A. The advantage here is 
that no problems with Hilbert space structure can 
arise (because the approach is entirely in terms of 
operator algebras); a drawback is that in general F is 
not unique, and nonvacuum representations of F 
also have to be considered in order to generate all 
sectors of A. 

The method can be used to classify and construct 
both nonlocal chiral extensions as candidates for 
sector-generating field algebras for a theory A of 
chiral observables, and local two-dimensional quan- 
tum field theories containing two given chiral 
subtheories, that is, observable algebras of two- 
dimensional models (Kawahigashi and Longo 2004). 
The chiral sector structure of the latter models is 
described by a “modular invariant.” In many cases, 
this means that their thermal partition functions are 
invariant under the group PSL(2, ℤ) of modular 
transformations of the temperature (see below). 

At this point, another link between spacetime and 
internal symmetries may be noted. The modular 
theory of von Neumann algebras (see Tomita- 
Takesaki Modular Theory) associates a one-para- 
meter group of automorphisms (called the “modular 
group”) with a state and an algebra “in standard 
position.” In quantum field theory, for the vacuum 
state and an algebra of observables localized in 
certain wedge regions of Minkowski spacetime, this 
group can be identified with a boost subgroup of the 
Lorentz group (Bisognano and Wichmann 1975). 
Similarly, in chiral CFT on the circle, the modular 
group associated with the observables in an interval 
and the vacuum coincides with a subgroup of the 
Möbius group. For nonlocal theories, there may be 
an obstruction, however. On the other hand, if a 
subalgebra is stable under the modular group of 
some algebra, then there is a conditional expectation 
from the larger algebra onto the smaller algebra. 
Combining these general theorems, the Möbius 
covariance of the inclusions $A(O) \subset F(O)$ implies 
the existence of a conditional expectation, that is, 
the above generalization of the average over the 
internal symmetry. Moreover, assuming a general- 
ized notion of compactness (“finite index”) for the 
generalized internal symmetry, the Bisognano-Wich- 
mann property holds also for nonlocal theories 
(Longo and Rehren 2004). 

Of course, there is also a WWW scenario in chiral 
theories, that is, one may restrict a local theory to its 
invariants under some group of internal gauge 
symmetries (“orbifold models”). It then happens 
that the invariants not only have the expected 
superselection sectors in correspondence with the 
representations of the gauge group, but in addition 
“twisted” sectors appear which, together with the 
former, constitute a “quantum double” structure. 
The twisted sectors arise by restriction of solitonic 
sectors of the original theory, which are in one-to-one 
correspondence with the elements of the gauge 
group (Müger 2005). Solitonic sectors are localiz- 
able with respect to two different vacua, and do 
not admit an unrestricted composition law. 


Special Issues 


A particularly simple situation is the case of anyons, 
that is, when all sectors have statistical dimension 1. 
Then the sectors form an abelian group G under 
fusion, and one can construct a WWW scenario with 
global gauge group $\hat{G}$, the dual of G. The ensuing 
quantum fields satisfy generalized commutation rela- 
tions at spacelike separation, given by an abelian 
representation of the braid group, where the coeffi- 
cients can be arbitrary complex phases (responsible 
for the name “anyons”). However, it is known that 
there can arise an obstruction, which enforces the 
“local” global gauge transformations (mentioned 
before) to be present. In this case, the gauge 
symmetry can also be described by a quasiquantum 
group. It is noteworthy that free anyon fields have 
been constructed in two-dimensional spacetime, 
while in three dimensions there can be no (cone-) 
localized massive anyon fields which are free in the 
sense that they generate only single-particle states 
from the vacuum (Mund 1998). 

The charge structure of massive quantum field 
theories in two dimensions is very different both 
from that encountered in conformal quantum field 
theories, and from the charge structure in high 
dimensions. It has been observed long ago that, in 
contrast to four dimensions, the strong locality 
property (Haag duality) which is necessary to set 
up the DHR analysis of superselection sectors, fails 
for the algebra of invariants under an internal gauge 
group in two dimensions. This algebraic feature can 
be traced back to the fact that the causal comple- 
ment of a point is disconnected in two dimensions, 
or, in physical terms, that “a charge cannot be 
transported around a detector” without passing 
through its region of causal dependence. Müger 
(1998) has shown that any algebra of observables 
which satisfies Haag duality cannot possess any 
nontrivial DHR superselection sectors at all, and 
that the only sectors which can exist are solitonic 
sectors. This general result nicely complies with the 
experience with integrable models, as mentioned 
before. 

There are also some results giving interesting 
insight, which can be obtained intrinsically in terms 
of the observables. One of them concerns “central” 
observables (generalized Casimir operators). 

Casimir operators in the WWW scenario are 
functions of the generators of the internal symmetry 
which usually are integrals over densities belonging 
to the field algebra F (Noether’s theorem). Since 
they also commute with the generators, they can be 
approximated by local observables, and are there- 
fore defined in each representation of the latter. By 
Schur’s lemma, they are multiples of the identity in 
each irreducible sector. Since the eigenvalues of 
Casimir operators distinguish the representations of 
the gauge group, they also distinguish the sectors. 

In chiral CFT extended to the circle (see above), 
one can find global “charge measuring operators” 
$C_i$, one for each sector $\pi_i$, in the center of the 
observable algebra (Fredenhagen et al. 1992) which 
have similar properties. They arise as a consequence 
of an algebraic obstruction to define the charged 
sectors on the circle, related to a nontrivial effect if a 
charge is “transported once around the circle,” and 
form an operator representation of the fusion rules 
within the global algebra of observables. Under 
rather natural conditions clarified by Kawahigashi, 
Longo, and Müger (2001), the matrix of eigenvalues 
$\pi_j(C_i)$ is nondegenerate, that is, the generalized 
Casimir operators completely distinguish the super- 
selection sectors. In this case, the superselection 
category is a modular category (see Braided and 
Modular Tensor Categories): the matrix with entries 
$d_{\pi_j} \cdot \pi_j(C_i)$ and the diagonal matrix with entries $\pi_i(U)$ 
(where U is the Möbius rotation by $2\pi$) are multi- 
ples of the generators S and T of the “modular 
group” PSL(2, ℤ), in a matrix representation labeled 
by the superselection sectors of the chiral observa- 
bles. The physical significance of this matrix 
representation is that it relates thermal expectation 
values for different values of the temperature (Cardy 
1986, Kac and Peterson 1984, Verlinde 1988). 
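
The modular structure described above can be checked numerically in a simple family of examples. The following sketch assumes the standard closed forms of the S and T matrices for the SU(2) current algebra at level k (the Kac-Peterson formula), which the text does not spell out; the function names are illustrative. It verifies the defining relations of the modular group and recovers integer fusion coefficients through the Verlinde formula.

```python
import numpy as np

def modular_data(k):
    """S and T matrices for SU(2) at level k, sectors labeled a = 2j = 0..k.

    Assumed standard formulas:
      S[a, b] = sqrt(2/(k+2)) * sin((a+1)(b+1) pi/(k+2))
      T[a, a] = exp(2 pi i (h_a - c/24)),  h_a = a(a+2)/(4(k+2)),  c = 3k/(k+2)
    """
    n = k + 2
    a = np.arange(k + 1)
    S = np.sqrt(2.0 / n) * np.sin(np.pi * np.outer(a + 1, a + 1) / n)
    c = 3.0 * k / n
    h = a * (a + 2) / (4.0 * n)
    T = np.diag(np.exp(2j * np.pi * (h - c / 24.0)))
    return S, T

def verlinde(S):
    """Fusion coefficients N[a, b, c] = sum_m S[a,m] S[b,m] conj(S[c,m]) / S[0,m]."""
    N = np.einsum("am,bm,cm->abc", S, S, np.conj(S) / S[0])
    return np.rint(N.real).astype(int)

if __name__ == "__main__":
    S, T = modular_data(3)
    ST3 = np.linalg.matrix_power(S @ T, 3)
    print(np.allclose(S @ S, np.eye(4)))   # S^2 is the (here trivial) conjugation
    print(np.allclose(ST3, S @ S))         # (ST)^3 = S^2
    print(verlinde(S)[1, 1])               # channels of the spin-1/2 sector with itself
```

The first row of S, divided by its first entry, reproduces the statistical dimensions of the sectors, which illustrates how the dimensions enter the matrix built from the charge-measuring operators.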

These examples, together with the failure of the 
Coleman-Mandula theorem, may illustrate the 
intricate relations among spacetime geometry, cov- 
ariance, and internal symmetry (charge structure) in 
low dimensions. In relativistic quantum field theory, 
the link is provided by the principle of locality, 
which “turns geometry into algebra.” 


See also: Algebraic Approach to Quantum Field Theory; 
Axiomatic Quantum Field Theory; Braided and Modular 
Tensor Categories; Hopf Algebras and q-Deformation 
Quantum Groups; Integrability and Quantum Field 
Theory; Quantum Field Theory: A Brief Introduction; 
Quantum Fields with Topological Defects; Symmetries 
and Conservation Laws; Symmetries in Quantum Field 
Theory: Algebraic Aspects; Symmetry Breaking in Field 
Theory; Tomita-Takesaki Modular Theory; 
Two-Dimensional Conformal Field Theory and Vertex 
Operator Algebras; Two-Dimensional Models. 


Introduction 


This article treats the most important results and 
concepts relating to symmetry and conservation 
laws in quantum field theory. It includes such results 
as Wigner's theorem, Goldstone's theorem, the 
Bisognano-Wichmann theorem, the quantum 
Noether theorem, and the theorem on the existence 
of gauge groups and a field net. It is written within 
the framework of algebraic quantum field theory, 
this being the simplest setting capable of expressing 
all these concepts and results. 

Symmetries come in many guises. They are to a 
physical system what automorphisms are to a 
mathematical theory. In fact, when a physical 
system is described in mathematical terms, its 
symmetries correspond to the automorphisms of 
the mathematical structure and in particular form a 
group, its symmetry group. The reader should bear 
in mind this simple picture throughout its diverse 
variations. Readers unfamiliar with the mathemati- 
cal terminology should consult the appendix. 


Elementary Quantum Mechanics 


Before turning to quantum field theory, let us 
comment on symmetries in elementary quantum 
mechanics. These systems have the density matrices, 
that is, positive operators of trace 1, on an infinite- 
dimensional separable Hilbert space as states, the 
self-adjoint operators as observables. The expecta- 
tion value of the bounded observable A in the state 
determined by $\rho$ is given by $\mathrm{tr}\,\rho A$. Having specified 
the mathematical structure, the notion of symmetry 
follows. With a suggestive notation, it is a pair of 
mappings $A \mapsto \alpha A$, $\rho \mapsto \rho\alpha^{-1}$ such that 


$\mathrm{tr}\,(\rho\alpha^{-1})(\alpha A) = \mathrm{tr}\,\rho A$ 


for all observables A and states $\rho$. 

If we take $\rho$ and A to be the projections onto $\mathbb{C}\phi$ 
and $\mathbb{C}\psi$ for unit vectors $\phi$ and $\psi$, then the above 
condition corresponds to the conservation of 
transition probabilities $|(\phi, \psi)|^2$. This formed the 
starting point for the analysis of Wigner, who concluded: 


Theorem Every symmetry is of the form 
$A \mapsto UAU^{-1}$ and $\rho \mapsto U\rho U^{-1}$, where U is a unitary or 
antiunitary operator. 


As could have been foreseen from the outset, this 
simple result in no way distinguishes one elementary 
quantum-mechanical system from another. A more 
useful notion of symmetry results if the Hamiltonian 
is reckoned as part of the information describing the 
system and, therefore, has to be left invariant by a 
symmetry. The operator U above must therefore 
satisfy the condition $UHU^{-1} = H$, that is, it commutes 
with the Hamiltonian. As the Hamiltonian is the 
generator of time translations, U is a constant of 
motion. This is the genesis of the relation between 
symmetries and conservation laws. 
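
The two statements above, invariance of transition probabilities and conservation of a symmetry that commutes with the Hamiltonian, can be illustrated in a finite-dimensional toy system. The three-level Hamiltonian H and the reflection U below are hypothetical choices made for this sketch only; U is taken unitary (and here also self-adjoint, so that its expectation value is real).

```python
import numpy as np

# Hypothetical three-level system, symmetric under exchanging levels 0 and 2.
H = np.array([[1.0, 0.5, 0.0],
              [0.5, 2.0, 0.5],
              [0.0, 0.5, 1.0]])
U = np.array([[0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0]])  # unitary reflection, U^{-1} = U^T = U

def evolve(psi, t):
    """Apply exp(-i H t) to psi using the spectral decomposition of H."""
    w, v = np.linalg.eigh(H)
    return v @ (np.exp(-1j * w * t) * (v.conj().T @ psi))

def expectation(op, psi):
    """Expectation value (psi, op psi) for a unit vector psi."""
    return np.vdot(psi, op @ psi)

phi = np.array([1.0, 1.0j, 0.3])
phi = phi / np.linalg.norm(phi)
psi = np.array([0.2, -1.0, 1.0j])
psi = psi / np.linalg.norm(psi)

if __name__ == "__main__":
    # U leaves the Hamiltonian invariant: U H U^{-1} = H.
    print(np.allclose(U @ H @ U.T, H))
    # Transition probabilities are preserved by the symmetry.
    print(np.isclose(abs(np.vdot(U @ phi, U @ psi)) ** 2, abs(np.vdot(phi, psi)) ** 2))
    # The expectation value of U does not change in time.
    print([np.round(expectation(U, evolve(phi, t)).real, 10) for t in (0.0, 0.7, 3.1)])
```

Replacing U by a unitary that does not commute with H (for instance, the exchange of levels 0 and 1) makes the last check fail, which is the sense in which only symmetries of the Hamiltonian yield constants of motion.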


Quantum Field Theories 


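The simplest types of quantum field theories can be
described by von Neumann algebras $\mathfrak{A}(\mathcal{O})$ depending
on double cones $\mathcal{O}$ and subject to


$\mathcal{O}_1 \subset \mathcal{O}_2 \Rightarrow \mathfrak{A}(\mathcal{O}_1) \subset \mathfrak{A}(\mathcal{O}_2)$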


a structure referred to as the net of observables. 

An alternative approach would be to use the 
Wightman formalism. This would need a discussion 
of pointlike fields and the domains of definition of 
unbounded operators, thus complicating a general 
exposition of symmetry. 

Comparing this description of a quantum field 
theory with that of an elementary quantum- 
mechanical system, the net clearly substitutes obser- 
vables but nothing has yet been said about states. 
Since the set of double cones is directed under 
inclusion, the union of the A(O) is a *-algebra A and 
a state of our system is a state on this algebra. 

Most states are of no physical relevance. A 
characterization of the states of physical relevance, 
even say to elementary particle physics, is not 
known although some progress has been made. 

The net structure is the hallmark of a field theory 
and allows us to distinguish two important classes of 
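symmetries. An internal symmetry $\alpha$ satisfies the
condition


$\alpha(\mathfrak{A}(\mathcal{O})) = \mathfrak{A}(\mathcal{O})$


for all double cones $\mathcal{O}$. By contrast, a spacetime
symmetry is an automorphism $\alpha_L$ implementing a
Poincaré transformation $L$ and hence satisfying the
condition


$\alpha_L(\mathfrak{A}(\mathcal{O})) = \mathfrak{A}(L\mathcal{O})$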


for every double cone O. It is usually the case that 
internal symmetries commute with spacetime 
symmetries. 

The state of prime relevance to elementary particle
physics is the vacuum state $\omega_0$. The corresponding
Gelfand-Naimark-Segal (GNS) representation $\pi_0$ is
called the vacuum representation. Now the vacuum
state of a quantum field theory is typically unique
and as such invariant under a symmetry $\alpha$ of the system:


$\omega_0\alpha = \omega_0$.


Spacetime Symmetries 


Since the vacuum state is invariant, we have a 
unitary representation of the Poincaré group imple- 
menting the spacetime symmetries in the vacuum 
representation. To illustrate the role of representa- 
tions up to a factor, we take instead the GNS 
representation of a pure state corresponding to a 
particle of half-integral spin. Here we need a unitary
representation of the covering group of the Poincaré
group, inhomogeneous SL(2, C), to implement the
symmetries. The situation for the subgroup of
rotations is the same.

The most important property of these representa- 
tions is positivity of the energy. More precisely, in a 
representation of relevance to elementary particle 
physics such as the vacuum representation, the 
generator $P^0$ of time translations is a positive
operator, $P^0 \geq 0$. Expressed in a frame-independent
way, the spectrum of spacetime translations is 
contained in the closed forward light cone. It is 
one of the basic principles to be exploited in 
applying quantum field theory to elementary particle 
physics. Notice that the principle is no longer valid 
for an equilibrium state. 

A similar situation arises in conformal field 
theory. Here the role of double cones in Minkowski 
space is played by intervals on the circle and that of 
the Poincaré group by the Möbius group on the 
circle PSL(2, R). Again, the Möbius group cannot
always be unitarily implemented and conformal 
invariance is defined via a continuous unitary 
representation of its covering group. Most impor- 
tantly, there is an analog of positivity of the energy. 
The generator of rotations of the circle is a positive 
operator. 

A remarkable aspect of spacetime symmetries was 
discovered by Bisognano and Wichmann in an 
application of modular theory in the field-theoretical 
context looking not at double cones but at wedges. 
A wedge $W$ is a Poincaré transform of the standard
wedge $x^1 > |x^0|$. They found that the modular
automorphisms of $\mathfrak{A}(W)$ and the vacuum vector $\Omega_0$
have a geometric significance. For the standard
wedge, they got the following result.


Theorem If the net is derived from Wightman
fields, the modular operator is $e^{2\pi K}$, where $K$ is the
generator of boosts in the 1-direction and the
modular conjugation is $ZR\Theta$, where $\Theta$ is the TCP-
operator, $R$ is the rotation through $\pi$ about the
1-axis, and $Z$ is the unitary operator equal to 1 on
the Bose subspace and $-i$ on the Fermi subspace.


The modular data for $\mathfrak{A}(\mathcal{O})$ and $\Omega_0$ also admit a
geometric interpretation for the free massless scalar
field.

These facts enhance our understanding of space- 
time symmetries. The ideas have meanwhile been 
applied to curved spacetime to select a state with 
vacuum-like properties using the principle of the 
geometric action of the modular conjugation. 


Gauge Symmetry 


Gauge symmetries do not fit into our scheme in that
they act trivially on the observable algebra $\mathfrak{A}$. To
exhibit a gauge symmetry we need a larger net $\mathfrak{F}$
called the field net. The gauge group $G$ will be the
group of automorphisms of $\mathfrak{F}$ leaving the subnet $\mathfrak{A}$
pointwise fixed and $\mathfrak{A}$ the subnet of $\mathfrak{F}$ of fixed
points under $G$. This has the merit of indicating the
mathematical framework for gauge symmetry but
otherwise begs important questions. A priori one
does not know what properties $\mathfrak{F}$ should have nor
how it should be constructed.

The right approach is to understand what intrinsic
structure of $\mathfrak{A}$ governs the existence of a nontrivial
gauge group. This brings us back to the states or
representations relevant to elementary particle physics.
A condition for selecting some of these relevant
representations is that asymptotically they be like
the vacuum in spacelike directions. More precisely,
$\pi$ must be unitarily equivalent to the vacuum
representation $\pi_0$ on the spacelike complement of
every double cone.

The resulting theory of superselection sectors
hinges on the property of Haag duality that, for
each double cone $\mathcal{O}$,


$\mathfrak{A}(\mathcal{O}) = \mathfrak{A}(\mathcal{O}')'$


where $\mathcal{O}'$ denotes the spacelike complement of $\mathcal{O}$. It
implies that every representation satisfying the
selection criterion is unitarily equivalent to one of
the form $\pi_0 \circ \rho$, where $\rho$ is an endomorphism of $\mathfrak{A}$
localized in some fixed but arbitrary double cone $\mathcal{O}$,
that is, $\rho(A) = A$ if $A \in \mathfrak{A}(\mathcal{O}')$. The endomorphisms
thus obtained are closed under composition and
hence the objects of a full tensor subcategory $\mathcal{T}$ of
the category of all endomorphisms and their intertwiners.
There is a dimension function $d$ defined on
the objects of $\mathcal{T}$, $d(\rho) = 1, 2, \ldots, \infty$. If $\mathcal{T}_f$ denotes
the full subcategory whose objects have finite
dimension, then the following result holds.


Theorem $\mathcal{T}_f$ is equivalent to the tensor category of
finite-dimensional continuous unitary representations
of a canonical compact group $G$. There is a
canonical field net $\mathfrak{F}$ with Bose-Fermi commutation
relations extending $\mathfrak{A}$ such that $G$ is the group of
automorphisms of $\mathfrak{F}$ leaving $\mathfrak{A}$ pointwise fixed.


The first step in the proof is to define and analyze 
the statistics of the representations in question. The 
statistics of an irreducible representation $\rho$ can be
classified as being para-Bose or para-Fermi of order
$d(\rho)$. The second step is to show that each $\rho$ of finite
dimension has a well-defined conjugate up to
equivalence. The third and most difficult step is
showing that $\mathcal{T}_f$ can be embedded in the tensor
category of Hilbert spaces.


The Local Implementation 
of Symmetries 


Gauge symmetry has its associated conservation 
laws in that the different sectors of the last section 
are labeled by conserved quantities such as baryon 
number, lepton number, or electric charge, gener- 
ically called charges. The theory is built round the 
idea of creating charge and elements of the field net 
carry charges. But there should be a dual approach 
based on measuring charges. One would like to 
prove the existence of local conserved currents 
corresponding to these charges. This has not proved 
possible but there is a good substitute, described 
below, which can be regarded as a weak version of a 
quantum Noether theorem. 

If $\mathcal{O}_1 \subset \mathcal{O}_2$ is a strict inclusion of double cones,
then the theory is said to satisfy the split property if
there is a type I factor $M$ such that


$\mathfrak{A}(\mathcal{O}_1) \subset M \subset \mathfrak{A}(\mathcal{O}_2)$


where a type I factor is a von Neumann algebra
isomorphic to some $B(H)$. In this case $M$ can be
chosen in a canonical fashion and there is an
isomorphism $\psi$, called the universal localizing map,
of $B(H)$ onto $M$, where $H$ is the underlying Hilbert
space. We have $\psi(A) = A$ for $A \in \mathfrak{A}(\mathcal{O}_1)$.


Theorem If $U$ is an implementing representation of
the internal symmetry group $G$, $\psi(U)$ will be a
representation of $G$ in $M$ that continues to implement
the symmetry on $\mathfrak{A}(\mathcal{O}_1)$. If $G$ is a Lie group
then the infinitesimal generators in the representation
are an analog of locally integrated current
densities.


Spontaneously Broken Symmetry 


The standard physical example of a spontaneously 
broken symmetry is magnetization. Despite the 
overall rotational symmetry, a magnet picks out a 
preferred direction as its direction of magnetization. 
The chosen state breaks the symmetry. 

The phenomenon of spontaneously broken sym- 
metry involves an interplay of symmetries and 
certain classes of states, vacuum states, ground 
states, or equilibrium states. If such an $\omega$ is induced
by a vector cyclic and separating for a local algebra
$\mathfrak{A}(\mathcal{O})$, then, as explained in the appendix, given $\mathcal{O}$,
modular theory yields a canonical unitary representation
$V$ of the internal symmetry group $G$:

$gA = V_g A V_g^{-1}, \quad A \in \mathfrak{A}(\mathcal{O})$

The results concern the breaking of a one-parameter
group $\lambda \mapsto \alpha_\lambda$ of symmetries. More
precisely, one asks whether $\omega \circ \delta = 0$ or not, where $\delta$
is the infinitesimal generator of $\lambda \mapsto \alpha_\lambda$,


$\delta(F) = \lim_{\lambda \to 0} \lambda^{-1}(\alpha_\lambda(F) - F)$


where norm convergence is understood and holds on a
dense domain. The derivation $\delta$ is an infinitesimal
symmetry. Goldstone first showed that the spontaneous
breaking of such symmetries requires the
presence of massless bosons. The following result is
taken from a more modern treatment. $\mathcal{O}_R$ here denotes
the double cone whose base is the ball in $t = 0$ of radius
$R$ centered on the origin and $D$ the domain of $\delta$.


Theorem Let $\delta$ be a derivation on a field net $\mathfrak{F}$ in
$s > 1$ spatial dimensions such that for $F \in \mathfrak{F}(\mathcal{O}_R) \cap D$


$|\omega(\delta F)| \leq c_{R,\varepsilon}(\|F\Omega\| + \|F^*\Omega\|) + \varepsilon\|\delta F\|$


(i) If $\liminf_{R \to \infty} c_{R,\varepsilon} R^{-(s-1)/2} = 0$, then $\omega \circ \delta = 0$.

(ii) If $\liminf_{R \to \infty} c_{R,\varepsilon} R^{-(s-1)/2} < \infty$, then $\omega \circ \delta \neq 0$ is
only possible if the spectrum of the translations
coincides with the forward light cone $V_+$ and the
boundary $\partial V_+ \setminus \{0\}$ has nontrivial spectral measure
(i.e., there are massless particles in the
theory).

(iii) If $c_{R,\varepsilon}$ is polynomially bounded in $R$, then
$\omega \circ \delta \neq 0$ is only possible if the spectrum of
translations coincides with $V_+$ but there are
not necessarily any massless particles.


Symmetries of the S-matrix 


Scattering theory not only allows one to construct 
the multiparticle scattering states but also shows 
that internal symmetries and spacetime symmetries 
continue to act on these states and are therefore 
symmetries of the S-matrix. We can, however, ask 
what are all the symmetries of the S-matrix. An 
answer was provided by Coleman and Mandula, 
who showed that, when there is nontrivial scatter- 
ing, there are no further symmetries of the S-matrix. 


Appendix 


In an effort to make this article more self-contained, 
this appendix collects together a few simple perti- 
nent concepts and results from the theory of 
operator algebras. A C*-algebra is a *-algebra $\mathcal{A}$
with a norm $\|\cdot\|$ making it into a Banach algebra
and satisfying


$\|A^*A\| = \|A\|^2$


for every $A \in \mathcal{A}$. Any C*-algebra can be realized as a
norm closed *-subalgebra of the C*-algebra $B(H)$ of
all bounded operators on a Hilbert space $H$. A von
Neumann algebra $\mathcal{R}$ is a C*-algebra that is the dual
space of a Banach space. This Banach space $\mathcal{R}_*$, the
predual of $\mathcal{R}$, is intrinsically defined. The topology
on $\mathcal{R}$ determined by duality with $\mathcal{R}_*$ is called the
$\sigma$-topology. $B(H)$ is a von Neumann algebra and its
predual is the set of trace class operators. Any
von Neumann algebra can be realized as a $\sigma$-closed
unital *-subalgebra of some $B(H)$.

A state on a C*-algebra $\mathcal{A}$ is a positive linear
functional $\omega$ of norm 1. If $\mathcal{A}$ has a unit $I$ the
normalization condition can be expressed as
$\omega(I) = 1$. Of fundamental importance is the relation
between representations and states. A representation
of $\mathcal{A}$ on a Hilbert space $H$ is just a structure-
preserving mapping or morphism of $\mathcal{A}$ into $B(H)$.
For simplicity, we suppose that $\mathcal{A}$ has a unit. Given
a state $\omega$, there is an associated representation $\pi_\omega$
defined by a vector $\Omega$ such that $\pi_\omega(\mathcal{A})\Omega$ is dense in
the Hilbert space in question, that is, it is a cyclic
vector for the representation and

$\omega(A) = (\Omega, \pi_\omega(A)\Omega), \quad A \in \mathcal{A}$

that is, the cyclic vector implements the given state.
This is referred to as the GNS construction. Given
any two such representations, there is a unique
unitary operator mapping the one cyclic vector onto
the other and realizing the equivalence of the
representations.

A state of a von Neumann algebra is said to be
normal if it is continuous in the $\sigma$-topology. If $\omega$ is
normal, then $\pi_\omega(\mathcal{R})$ is $\sigma$-closed.

An inclusion of unital von Neumann algebras has
the split property if there is an intermediate type I
factor, that is, if it has the form $\mathcal{R}_1 \subset B(H) \subset \mathcal{R}_2$.

The following elementary observation is often
used in treating symmetries. If $\alpha$ is an automorphism
of $\mathcal{A}$ with $\omega\alpha^{-1} = \omega$, there is a unique unitary
operator $U$ leaving the cyclic vector $\Omega$ invariant and
inducing $\alpha$ in the representation $\pi_\omega$. In other words,
$U\Omega = \Omega$ and


$U\pi_\omega(A)U^{-1} = \pi_\omega(\alpha A)$


If we apply the above lemma to a group $G$ of
symmetries leaving a state invariant, it yields a
group $U(g)$ of unitaries satisfying the condition


$U(gh) = U(g)U(h), \quad g, h \in G$


since $U(g)$ is uniquely defined by the above
conditions.

When there is no invariant state, the situation is
more complicated. Suppose there is a group $G$ of
symmetries and a representation $\pi$ of $\mathcal{A}$ where each
$g$ is unitarily implemented. Thus, there is a unitary
$U(g)$ with


$U(g)\pi(A)U(g)^{-1} = \pi(gA), \quad A \in \mathcal{A}$

All we can now conclude is that

$U(gh) = Z(g, h)U(g)U(h)$

where $Z(g, h)$ is a unitary in $\pi(\mathcal{A})'$, the commutant of
$\pi(\mathcal{A})$, satisfying the 2-cocycle identity

$Z(gh, k)Z(g, h) = Z(g, hk)\,{}^gZ(h, k)$

where ${}^gX = U(g)XU(g)^{-1}$. $U$ is said to be a
representation up to a factor. It can be chosen to be a
representation if the cocycle $Z$ is a coboundary, that
is, if there is a unitary $Y(g)$ in $\pi(\mathcal{A})'$ such that


$Y(g)\,{}^gY(h) = Y(gh)Z(g, h)$


In general, little is known about solving problems
of this kind, but there are a number of results when
$\pi$ is irreducible and the unitary group of its
commutant reduces to the circle.
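
A concrete finite example of a representation up to a factor may help here. The sketch below is an illustration under stated assumptions, not taken from the article: the group is $G = \mathbb{Z}_2 \times \mathbb{Z}_2$, and $U(a, b) = X^a Z^b$ is built from the Pauli matrices. The commutant consists of scalars, so ${}^gZ = Z$ and the cocycle is an ordinary phase. The code solves $U(gh) = Z(g, h)U(g)U(h)$ for $Z$ and checks the 2-cocycle identity over all triples.

```python
import itertools

import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Zp = np.array([[1, 0], [0, -1]], dtype=complex)  # Pauli Z, renamed to avoid clashing with the cocycle Z

G = list(itertools.product((0, 1), repeat=2))  # elements (a, b) of Z_2 x Z_2


def mul(g, h):
    return ((g[0] + h[0]) % 2, (g[1] + h[1]) % 2)


def U(g):
    return np.linalg.matrix_power(X, g[0]) @ np.linalg.matrix_power(Zp, g[1])


def cocycle(g, h):
    # Solve U(gh) = Z(g, h) U(g) U(h); here Z(g, h) is a scalar multiple of the identity.
    M = U(mul(g, h)) @ np.linalg.inv(U(g) @ U(h))
    assert np.allclose(M, M[0, 0] * np.eye(2))
    return M[0, 0]


identity_holds = all(
    np.isclose(cocycle(mul(g, h), k) * cocycle(g, h),
               cocycle(g, mul(h, k)) * cocycle(h, k))
    for g in G for h in G for k in G
)
```

In this example $U(0,1)$ and $U(1,0)$ anticommute although the group is abelian, so no choice of phases $Y(g)$ turns $U$ into a true representation: the cocycle is not a coboundary.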

We turn now to consider the modular theory of
von Neumann algebras. A vector $\Omega$ is said to be
separating for a von Neumann algebra $\mathcal{R}$ if $A\Omega = 0$
and $A \in \mathcal{R}$ implies $A = 0$. If $\Omega$ is both cyclic and
separating, there is a uniquely determined closed
antilinear involution $S$ with $SA\Omega = A^*\Omega$ for $A \in \mathcal{R}$.
If $S = J\Delta^{1/2}$ is the polar decomposition of $S$, then the
unitary operators $\Delta^{it}$ induce automorphisms $\sigma_t$ of $\mathcal{R}$
and $J\mathcal{R}J = \mathcal{R}'$. $J$ is called the modular conjugation, $\Delta$
the modular operator, and $\sigma_t$ the modular automorphisms.
The closure of $\{\Delta^{1/4}A\Omega : A \in \mathcal{R}, A \geq 0\}$
is a cone, called the natural cone. Every normal state
of $\mathcal{R}$ is implemented by a unique vector in the
natural cone. If $\alpha$ is an automorphism of $\mathcal{R}$, there is
therefore a unique vector $\Omega_\alpha$ in the natural cone
such that, for every $A \in \mathcal{R}$,


$(\Omega, \alpha^{-1}(A)\Omega) = (\Omega_\alpha, A\Omega_\alpha)$


There is now a canonical unitary operator $V_\alpha$
defined by

$V_\alpha A\Omega = \alpha(A)\Omega_\alpha$


$V_\alpha$ maps the natural cone into itself and $\alpha \mapsto V_\alpha$ is an
implementing representation of the group of automorphisms
of $\mathcal{R}$. Under these circumstances, we do
not have to deal with representations up to a factor.


See also: Algebraic Approach to Quantum Field Theory;


Introduction


The same symmetries may underlie diverse contexts 
such as phase transitions of crystals (Landau 
theory), fluid dynamics, and problems in biology 
and chemical engineering. Hence, seemingly unre- 
lated systems may exhibit similar phenomena in 
regard to symmetries of patterns and transitions 
between patterns (spontaneous symmetry breaking). 
It is natural to focus attention on aspects of pattern 
formation that are universal or model independent — 
aspects depending on underlying symmetries rather 
than model-specific details. 

The general framework is that the underlying
system is governed by an evolution equation


$\dot{x} = f(x)$ [1]


with symmetry group $\Gamma$. To avoid technicalities, we
assume that [1] is an ordinary differential equation
(ODE), the vector field $f : \mathbb{R}^n \to \mathbb{R}^n$ is as smooth as
desired, and $\Gamma$ is a compact Lie group acting linearly
on $\mathbb{R}^n$. An inner product may be chosen so that
$\Gamma$ acts orthogonally. The vector field in [1] is
$\Gamma$-equivariant if

$f(\gamma x) = \gamma f(x)$ for all $x \in \mathbb{R}^n$, $\gamma \in \Gamma$ [2]

Equivalently, if $x(t)$ is a solution and $\gamma \in \Gamma$, then
$\gamma x(t)$ is a solution.
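
Condition [2] is easy to test numerically for a given vector field. The following sketch assumes a hypothetical planar vector field, written in the complex coordinate $z = x_1 + ix_2$ as $f(z) = (\lambda + a|z|^2)z + b\bar{z}^{m-1}$ with real coefficients, together with the rotation-reflection action of the dihedral group $D_m$; the field and its coefficients are illustrative choices, not taken from the article.

```python
import numpy as np

m = 5
theta = 2 * np.pi / m
# Generators of the assumed D_m action on R^2: a rotation and a reflection.
rho = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
kappa = np.array([[1.0, 0.0], [0.0, -1.0]])


def f(x, lam=0.3, a=-1.0, b=0.5):
    # Illustrative vector field; lam, a, b are arbitrary real coefficients.
    z = complex(x[0], x[1])
    w = (lam + a * abs(z) ** 2) * z + b * z.conjugate() ** (m - 1)
    return np.array([w.real, w.imag])


rng = np.random.default_rng(1)
samples = rng.normal(size=(20, 2))

# f(gamma x) == gamma f(x) on the generators implies it on the whole group.
equivariant = all(
    np.allclose(f(g @ x), g @ f(x)) for x in samples for g in (rho, kappa)
)
```

Checking the generators suffices because equivariance under two group elements implies equivariance under their product.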

In this article, we are interested in the dynamics to
be expected for equivariant vector fields, and
transitions that arise as parameters are varied. The
symmetry group $\Gamma$ is taken as given, whereas $f$ is a
general $\Gamma$-equivariant vector field. (Other features
such as energy conservation or time reversibility 
must be built into the general setup, but are 
excluded in this article.) 


Isotropy Subgroups and Commuting 
Linear Maps 


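Let $\Gamma$ be a compact Lie group acting linearly on $\mathbb{R}^n$.
The isotropy subgroup of $x \in \mathbb{R}^n$ is defined to be


$\Sigma_x = \{\gamma \in \Gamma : \gamma x = x\}$


Note that $\Sigma_{\gamma x} = \gamma\Sigma_x\gamma^{-1}$ for all $x \in \mathbb{R}^n$, $\gamma \in \Gamma$.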


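Given an isotropy subgroup $\Sigma \subset \Gamma$, define the
fixed-point subspace


$\mathrm{Fix}\,\Sigma = \{y \in \mathbb{R}^n : \sigma y = y \text{ for all } \sigma \in \Sigma\}$


If $f : \mathbb{R}^n \to \mathbb{R}^n$ is a $\Gamma$-equivariant vector field, then
$f(\mathrm{Fix}\,\Sigma) \subset \mathrm{Fix}\,\Sigma$ for each isotropy subgroup $\Sigma$.
Hence $\mathrm{Fix}\,\Sigma$ is flow invariant.

The normalizer $N(\Sigma) = \{\gamma \in \Gamma : \gamma\Sigma\gamma^{-1} = \Sigma\}$ is the
largest subgroup of $\Gamma$ that acts on $\mathrm{Fix}\,\Sigma$, and
$f_\Sigma = f|_{\mathrm{Fix}\,\Sigma}$ is $(N(\Sigma)/\Sigma)$-equivariant.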

An isotropy subgroup $\Sigma$ is axial if $\dim \mathrm{Fix}\,\Sigma = 1$,
and then $N(\Sigma)/\Sigma \cong Z_2$ or 1. More generally, $\Sigma$ is
maximal if there are no isotropy subgroups $T$ with
$\Sigma \subset T \subset \Gamma$ other than $T = \Sigma$ and $T = \Gamma$. Then
$N(\Sigma)/\Sigma$ acts fixed-point freely on $\mathrm{Fix}\,\Sigma$ and the
connected component of the identity $(N(\Sigma)/\Sigma)^0 \cong 1$,
SO(2) or SU(2). Correspondingly $\Sigma$ is called real,
complex, or quaternionic. In the complex case
$\dim \mathrm{Fix}\,\Sigma$ is even; in the quaternionic case
$\dim \mathrm{Fix}\,\Sigma \equiv 0 \bmod 4$.

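The dihedral group $\Gamma = D_m$ of order $2m$ is the
symmetry group of the regular $m$-gon, $m \geq 3$. Its
standard action on $\mathbb{R}^2$ is generated by


$\rho = \begin{pmatrix} \cos 2\pi/m & -\sin 2\pi/m \\ \sin 2\pi/m & \cos 2\pi/m \end{pmatrix}, \quad \kappa = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$


For $m$ even, the isotropy subgroups up to conjugacy
are


$D_m$, $Z_2(\kappa)$, $Z_2(\rho\kappa)$, 1


where $Z_j(g)$ denotes the cyclic group of order $j$
generated by $g$. The maximal isotropy subgroups
$\Sigma = Z_2(\kappa)$, $Z_2(\kappa\rho)$ are axial with $N(\Sigma)/\Sigma \cong Z_2$. For
$m$ odd, $Z_2(\rho\kappa)$ is conjugate to $Z_2(\kappa)$ leaving three
conjugacy classes of isotropy subgroups, and
$\Sigma = Z_2(\kappa)$ is axial with $N(\Sigma)/\Sigma = 1$.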
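
For a finite group, the dimension of a fixed-point subspace can be computed from the standard trace formula $\dim \mathrm{Fix}\,\Sigma = |\Sigma|^{-1}\sum_{\sigma \in \Sigma} \mathrm{tr}\,\sigma$. The sketch below applies it to the dihedral action on $\mathbb{R}^2$, assuming the reflection $\kappa = \mathrm{diag}(1, -1)$; the helper names are hypothetical.

```python
import numpy as np


def dihedral_generators(m):
    t = 2 * np.pi / m
    rho = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
    kappa = np.array([[1.0, 0.0], [0.0, -1.0]])  # assumed form of the reflection
    return rho, kappa


def generated_group(generators):
    # Close {identity} under right multiplication by the generators.
    elems = [np.eye(2)]
    frontier = [np.eye(2)]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = g @ s
            if not any(np.allclose(h, e) for e in elems):
                elems.append(h)
                frontier.append(h)
    return elems


def dim_fix(group):
    # Trace formula: average of the traces over the (finite) group.
    return int(round(sum(np.trace(g) for g in group) / len(group)))


rho4, kappa4 = dihedral_generators(4)
D4 = generated_group([rho4, kappa4])
dims = {
    "D_4": dim_fix(D4),
    "Z_2(kappa)": dim_fix(generated_group([kappa4])),
    "Z_2(rho kappa)": dim_fix(generated_group([rho4 @ kappa4])),
    "1": dim_fix([np.eye(2)]),
}
```

For $m = 4$ the two reflection subgroups have one-dimensional fixed-point subspaces, so they are axial, while the full group fixes only the origin.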

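The space of commuting linear maps


$\mathrm{Hom}_\Gamma(\mathbb{R}^n) = \{L : \mathbb{R}^n \to \mathbb{R}^n \text{ linear} : L\gamma = \gamma L \text{ for all } \gamma \in \Gamma\}$


is completely described representation-theoretically.
Recall that $\Gamma$ acts irreducibly on $\mathbb{R}^n$ if the only
$\Gamma$-invariant subspaces of $\mathbb{R}^n$ are $\mathbb{R}^n$ and $\{0\}$. Then
$\mathrm{Hom}_\Gamma(\mathbb{R}^n)$ is a real division ring (skew field) $D \cong \mathbb{R}$,
$\mathbb{C}$ or $\mathbb{H}$. The representation is called absolutely
irreducible when $D \cong \mathbb{R}$ and nonabsolutely irreducible
when $D \cong \mathbb{C}$ or $\mathbb{H}$.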
If the action of $\Gamma$ is not irreducible, write $\mathbb{R}^n =
V_1 \oplus \cdots \oplus V_r$ (nonuniquely) as a sum of irreducible
subspaces. Summing together irreducible subspaces
that are isomorphic to form isotypic components $W_j$
gives the (unique) isotypic decomposition
$\mathbb{R}^n = W_1 \oplus \cdots \oplus W_\ell$. If $L \in \mathrm{Hom}_\Gamma(\mathbb{R}^n)$, then
$L(W_j) \subset W_j$ for each $j$, hence $\mathrm{Hom}_\Gamma(\mathbb{R}^n) =
\mathrm{Hom}_\Gamma(W_1) \oplus \cdots \oplus \mathrm{Hom}_\Gamma(W_\ell)$. Each $W_j$ consists of
$k_j$ isomorphic copies of an irreducible representation
with division ring $D_j$. Let $M_k(D)$ denote the space of
$k \times k$ matrices with entries in $D$. Then


$\mathrm{Hom}_\Gamma(\mathbb{R}^n) \cong M_{k_1}(D_1) \oplus \cdots \oplus M_{k_\ell}(D_\ell)$ [3]


Spectral properties of commuting linear maps can be
recovered from the decomposition [3], paying due
attention to multiplicity and complex conjugates of
eigenvalues.
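
The distinction between absolutely and nonabsolutely irreducible actions can be seen by averaging an arbitrary matrix over the group, which projects it onto the commuting maps. The sketch below is illustrative only: it uses the planar actions of $D_5$ (absolutely irreducible, $D \cong \mathbb{R}$) and of its rotation subgroup $Z_5$ (irreducible with $D \cong \mathbb{C}$), both chosen here as assumptions.

```python
import numpy as np


def rotation(t):
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])


m = 5
kappa = np.diag([1.0, -1.0])
rotations = [rotation(2 * np.pi * j / m) for j in range(m)]  # Z_m
dihedral = rotations + [r @ kappa for r in rotations]        # D_m


def average(L, group):
    # Group average of gamma L gamma^{-1}; the result commutes with the group.
    return sum(g @ L @ np.linalg.inv(g) for g in group) / len(group)


rng = np.random.default_rng(2)
L = rng.normal(size=(2, 2))

L_dihedral = average(L, dihedral)  # expected: a real multiple of the identity
L_cyclic = average(L, rotations)   # expected: a*I + b*J, a copy of the complex numbers

J = np.array([[0.0, -1.0], [1.0, 0.0]])
a = np.trace(L) / 2
b = (L[1, 0] - L[0, 1]) / 2
```

The averaged matrices exhibit the two division rings directly: scalars for $D_5$, and matrices of the form $aI + bJ$ with $J^2 = -I$ for $Z_5$.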


Equivariant Dynamics 


The dynamics of equivariant systems includes 
(relative) equilibria and periodic solutions, robust 
heteroclinic cycles/networks, and symmetric chaotic 
attractors. 


Equilibria 


Consider the ODE [1] with $\Gamma$-equivariant vector
field $f$ satisfying [2]. If $x(t) \equiv x_0$ is an equilibrium,
$f(x_0) = 0$, then there is a group orbit $\Gamma x_0$ of
equilibria.

Let $\Sigma = \Sigma_{x_0}$ be the isotropy subgroup of $x_0$. If
$\dim \Sigma = \dim \Gamma$, then generically (for an open dense
set of $\Gamma$-equivariant vector fields), the eigenvalues of
$(df)_{x_0}$ have nonzero real part, hence $x_0$ is hyperbolic.
If the eigenvalues all have negative real part, then $x_0$
is asymptotically stable. If at least one eigenvalue
has positive real part, then $x_0$ is unstable. Hyperbolic
equilibria are isolated and persist under
perturbations of $f$; the perturbed equilibria continue
to have isotropy $\Sigma$. Since $(df)_{x_0} \in \mathrm{Hom}_\Sigma(\mathbb{R}^n)$,
decomposition [3] for the action of $\Sigma$ on $\mathbb{R}^n$
facilitates stability computations for $x_0$.

If $\dim \Sigma < \dim \Gamma$, then $\Gamma x_0$ is a continuous group
orbit of equilibria. Generically, $\dim \ker (df)_{x_0} =
\dim \Gamma - \dim \Sigma$ and $\ker (df)_{x_0} = \{\xi x_0 : \xi \in L\Gamma\}$, where
$L\Gamma$ is the Lie algebra of $\Gamma$. The remaining $k = n -
\dim \Gamma + \dim \Sigma$ eigenvalues generically have nonzero
real part so $\Gamma x_0$ is normally hyperbolic. If all $k$
eigenvalues have negative real part, then $\Gamma x_0$ is
asymptotically stable. If at least one has positive real
part, then $\Gamma x_0$ is unstable. When $N(\Sigma)/\Sigma$ is finite,
generically $x_0$ is an isolated equilibrium in $\mathrm{Fix}\,\Sigma$ and
persists as an equilibrium with isotropy $\Sigma$ under
perturbation.


Relative Equilibria and Skew Products 


A point $x_0 \in \mathbb{R}^n$ (or the corresponding group orbit
$\Gamma x_0$) is a relative equilibrium if $f(x_0) \in T_{x_0}\Gamma x_0 =
L\Gamma x_0$. If $x_0$ has isotropy $\Sigma$, then $x_0$ is a relative
equilibrium if $f(x_0) \in LD_\Sigma x_0$, where $D_\Sigma = (N(\Sigma)/\Sigma)^0$.

Write $f(x_0) = \xi x_0$, where $\xi \in LD_\Sigma$. The closure of
the one-parameter subgroup $\exp(t\xi)$ is a maximal
torus in $D_\Sigma$ for almost every $\xi$. All maximal tori are
conjugate with common dimension $d = \mathrm{rank}\, D_\Sigma$.
The solution $x(t) = \exp(t\xi)x_0$ is typically a
$d$-dimensional quasiperiodic motion. "Typically"
holds in both the topological and probabilistic sense
and there is no phase-locking. When $d = 1$, $x(t)$ is
periodic, often called a rotating wave.
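
The simplest rotating wave can be written down explicitly. The sketch below assumes $\Gamma = \mathrm{SO}(2)$ acting on $\mathbb{R}^2$ and a hypothetical equivariant vector field $f(x) = (\lambda - |x|^2)x + \omega Jx$ (neither is taken from the article). On the circle $|x|^2 = \lambda$ one has $f(x_0) = \xi x_0$ with $\xi = \omega J$ in the Lie algebra, and the code checks that $x(t) = \exp(t\xi)x_0$ satisfies the ODE.

```python
import numpy as np

lam, omega = 0.49, 2.0                    # illustrative parameter values
J = np.array([[0.0, -1.0], [1.0, 0.0]])   # generator of rotations
xi = omega * J                            # Lie algebra element with f(x0) = xi x0


def f(x):
    # Assumed SO(2)-equivariant vector field on R^2.
    return (lam - x @ x) * x + omega * (J @ x)


def exp_t_xi(t):
    # exp(t * omega * J) is the rotation through angle omega * t.
    c, s = np.cos(omega * t), np.sin(omega * t)
    return np.array([[c, -s], [s, c]])


x0 = np.array([np.sqrt(lam), 0.0])
times = np.linspace(0.0, 5.0, 11)

# The derivative of exp(t xi) x0 is xi exp(t xi) x0, so the residual
# f(x(t)) - xi x(t) vanishes exactly when x(t) solves the ODE.
residuals = [
    np.linalg.norm(f(exp_t_xi(t) @ x0) - xi @ (exp_t_xi(t) @ x0)) for t in times
]
period = 2 * np.pi / omega
```

Here $d = 1$, so the relative equilibrium is a periodic orbit with period $2\pi/\omega$.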

Choose a $\Sigma$-invariant local cross section $X$ to the
group orbit $\Gamma x_0$ at $x_0$. There is a $\Gamma$-invariant
neighborhood of $\Gamma x_0$ that is $\Gamma$-equivariantly
diffeomorphic to $(\Gamma \times X)/\Sigma$, where $\Sigma$ acts freely on $\Gamma \times X$
by


$\sigma \cdot (\gamma, x) = (\gamma\sigma^{-1}, \sigma x)$


and $\Gamma$ acts by left multiplication on the first
factor. The $\Gamma$-equivariant ODE on $(\Gamma \times X)/\Sigma$ lifts
to a $(\Gamma \times \Sigma)$-equivariant skew product on $\Gamma \times X$


$\dot{\gamma} = \gamma\xi(x), \quad \dot{x} = h(x)$ [4]


where $\xi : X \to L\Gamma$, $h : X \to X$ satisfy the $\Sigma$-equivariance
conditions


$\xi(\sigma x) = \mathrm{Ad}_\sigma \xi(x) = \sigma\xi(x)\sigma^{-1}$
$h(\sigma x) = \sigma h(x)$


and $h(x_0) = 0$.

Thus, dynamics near the relative equilibrium
$\Gamma x_0 \subset \mathbb{R}^n$ reduces to dynamics near the ordinary
equilibrium $x_0 \in X$ for the $\Sigma$-equivariant vector field
$h : X \to X$, coupled with $\Gamma$ drifts. In particular, the
stability of $\Gamma x_0$ is determined by $(dh)_{x_0}$.


Periodic Solutions 


A nonequilibrium solution $x(t)$ is periodic if $x(t + T) =
x(t)$ for some $T > 0$. The least such $T$ is the (absolute)
period. The spatial symmetry group $\Delta$ is the isotropy
subgroup of $x(t)$ for some, and hence all, $t \in \mathbb{R}$. The
periodic solution $P = \{x(t) : 0 \leq t < T\}$ lies inside
$\mathrm{Fix}\,\Delta$. Define the spatiotemporal symmetry group
$\Sigma = \{\gamma \in \Gamma : \gamma P = P\}$. Note that $\Delta$ is a normal subgroup
of $\Sigma$ and either $\Sigma/\Delta \cong S^1$ ($P$ is a rotating wave) or
$\Sigma/\Delta \cong Z_q$ and $P$ is called a standing wave or a discrete
rotating wave. For each $\sigma \in \Sigma$, there exists $T_\sigma \in [0, T)$
such that $\sigma x(t) = x(t + T_\sigma)$. The relative period of $x(t)$
is the least $T > 0$ such that $x(T) \in \Gamma x(0)$.

If $\dim \Sigma = \dim \Gamma$, then generically $P$ is hyperbolic,
hence isolated, the stability of $P$ is determined by its

186 Symmetry and Symmetry Breaking in Dynamical Systems 


Floquet exponents, and $P$ persists under perturbation
as a periodic solution with spatial symmetry $\Delta$
and spatiotemporal symmetry $\Sigma$. For $\Gamma$ infinite and
$N(\Delta)/\Delta$ finite, generically $P$ is isolated in $\mathrm{Fix}\,\Delta$ and
the neutral Floquet exponent has multiplicity
$\dim \Gamma - \dim \Sigma + 1$.


Relative Periodic Solutions 


A solution $x(t)$ is a relative periodic solution if it is
not a relative equilibrium and $x(T) \in \Gamma x(0)$ for some
$T > 0$. The least such $T$ is the relative period. The
spatial symmetry group $\Delta = \Sigma_{x(t)}$ for some, hence
all, $t$. The spatiotemporal symmetry group $\Sigma$ is the
closed subgroup of $\Gamma$ generated by $\Delta$ and $\sigma$, where
$x(T) = \sigma x(0)$, and generically $\Sigma/\Delta \cong T^d \times Z_q$ is a
maximal topologically cyclic (Cartan) subgroup of
$N(\Delta)/\Delta$ containing $\sigma\Delta$. Then $x(t)$ is a $(d + 1)$-
dimensional quasiperiodic motion.

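The dynamics near the relative periodic solution
is again governed by a skew product. There exists
$n \geq 1$ such that $\sigma^n = \exp(n\xi)$, where $\xi \in LZ(\Sigma)$
and $Z(\Sigma) \subset \Gamma$ is the centralizer of $\Sigma$. Define
$\alpha = \exp(-\xi)\sigma$. Form a semidirect product $\Delta \rtimes Z_{2n}$
by adjoining to $\Delta$ an element $Q$ of order $2n$ such
that $Q\delta Q^{-1} = \alpha\delta\alpha^{-1}$ for $\delta \in \Delta$.

In a comoving frame with velocity $\xi$, a neighborhood
of the relative periodic orbit is $\Gamma$-equivariantly
diffeomorphic to $(\Gamma \times X \times S^1)/(\Delta \rtimes Z_{2n})$, where $X$ is
a $\Delta \rtimes Z_{2n}$-invariant cross section, $S^1 = \mathbb{R}/2n\mathbb{Z}$ and
$\Delta \rtimes Z_{2n}$ acts on $\Gamma \times X \times S^1$ as


$\delta \cdot (\gamma, x, \theta) = (\gamma\delta^{-1}, \delta x, \theta)$
$Q \cdot (\gamma, x, \theta) = (\gamma\alpha^{-1}, Qx, \theta + 1)$


The $\Gamma$-equivariant ODE on $(\Gamma \times X \times S^1)/(\Delta \rtimes Z_{2n})$
lifts to a $\Gamma \times (\Delta \rtimes Z_{2n})$-equivariant skew product


$\dot{\gamma} = \gamma\xi(x, \theta), \quad \dot{x} = h(x, \theta), \quad \dot{\theta} = 1$ [5]


where $\xi : X \times S^1 \to L\Gamma$, $h : X \times S^1 \to X$ satisfy appropriate
$\Delta \rtimes Z_{2n}$-equivariance conditions.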


Robust Heteroclinic Cycles 


Heteroclinic cycles, degenerate in systems without
symmetry, arise robustly in equivariant systems. Let
$x_1, \ldots, x_m \in \mathbb{R}^n$ be saddles with $W^u(x_j) - \{x_j\} \subset
\Gamma W^s(x_{j+1})$ (where $m + 1 = 1$). If $\Sigma_1, \ldots, \Sigma_m \subset \Gamma$ are
isotropy subgroups, $W^u(x_j) \subset \mathrm{Fix}\,\Sigma_j$, and $x_{j+1}$ is a
sink in $\mathrm{Fix}\,\Sigma_j$, then saddle-sink connections from $x_j$
to $x_{j+1}$ persist for nearby $\Gamma$-equivariant flows. The
union $\bigcup_{j=1}^m \Gamma W^u(x_j)$ forms a robust heteroclinic cycle
(see the subsection "Dynamics" for an example). Such
cycles, when asymptotically stable, are a mechanism
for intermittency or bursting, notably in rotating
Rayleigh-Bénard convection (where rolls disappear
and reorient themselves at approximately 60°), and
provide a possible intrinsic explanation for irregular
reversals of the Earth's magnetic field.

Asymmetric perturbations (deterministic or noisy) 
destroy the cycles, but the perturbed attractors 
inherit the bursting behavior. 

Establishing the existence of heteroclinic connections
is often straightforward when $\dim \mathrm{Fix}\,\Sigma_j = 2$
and nontrivial when $\dim \mathrm{Fix}\,\Sigma_j \geq 3$. Criteria for
asymptotic stability of heteroclinic cycles are given
in terms of real parts of eigenvalues of $(df)_{x_j}$ and
depend on the geometry of the representation of $\Gamma$.

Robust cycles exist also between more complicated
dynamical states such as periodic solutions or chaotic
sets (cycling chaos). When $W^u(x_j)$ connects to two or
more distinct states, the collection of unstable
manifolds forms a heteroclinic network leading to
competition between various subnetworks.


Symmetric Attractors 


Suppose that $\Gamma$ is a finite group acting linearly on $\mathbb{R}^n$.
A closed subset $A \subset \mathbb{R}^n$ has symmetry groups $\Delta =
\{\gamma \in \Gamma : \gamma x = x \text{ for all } x \in A\}$, $\Sigma = \{\gamma \in \Gamma : \gamma A = A\}$.
Here, $\Delta$ is an isotropy subgroup and $\Delta \subset \Sigma \subset N(\Delta)$.
In applications, $\Delta$ corresponds to instantaneous
symmetry and $\Sigma$ to symmetry on average.

If $A$ is an attractor (a Lyapunov stable $\omega$-limit set)
for a $\Gamma$-equivariant vector field $f : \mathbb{R}^n \to \mathbb{R}^n$, then $\Sigma$
fixes a connected component of $\mathrm{Fix}\,\Delta - L$, where $L$
is the union of proper fixed-point spaces in $\mathrm{Fix}\,\Delta$.

Provided $\dim \mathrm{Fix}\,\Delta \geq 3$, all pairs $\Delta, \Sigma$ satisfying
the above restrictions arise as symmetry groups of a
nonperiodic attractor $A$. If $\dim \mathrm{Fix}\,\Delta \geq 5$, then $A$ is
realized by a uniformly hyperbolic (Axiom A)
attractor.

If $\dim \mathrm{Fix}\,\Delta \geq 3$ and $\Sigma$ fixes a connected component
of $\mathrm{Fix}\,\Delta - L$, then $A$ is realized by a periodic
sink provided $\Sigma/\Delta$ is cyclic. If $\dim \mathrm{Fix}\,\Delta = 2$, then in
addition either $\Sigma = \Delta$ or $\Sigma = N(\Delta)$.

Suppose $A$ is an attractor and $\gamma \in \Gamma - \Sigma$. Then
$\gamma A \cap A = \emptyset$. Varying a parameter, $A$ may undergo a
symmetry-increasing bifurcation: $A$ grows until it
collides with $\gamma A$ producing a larger attractor with
symmetry on average generated by $\Sigma$ and $\gamma$.

Determining symmetries of an attractor by inspection
is often infeasible. A detective is a $\Gamma$-equivariant
polynomial $\phi : \mathbb{R}^n \to V$ where every subgroup of $\Gamma$ is
an isotropy subgroup for the action on $V$, and each
component of $\phi$ is nonzero. Suppose that $A \subset \mathbb{R}^n$ is
an attractor with physical (Sinai-Ruelle-Bowen)
measure $\mu$. By ergodicity, the time average


$w_A = \lim_{T \to \infty} \frac{1}{T} \int_0^T \phi(x(t))\,dt \in V$


is well defined for almost every trajectory $x(t)$ in
$\mathrm{supp}\,\mu$. Generically, $\Sigma_{w_A} = \Sigma_A$ so computing the
symmetry of $A$ reduces to computing the symmetry
of a point.

If $\Gamma$ is an infinite compact Lie group, and $A$ is an
$\omega$-limit set containing points of trivial isotropy, then
$A$ cannot be uniformly hyperbolic. Hence partially
hyperbolic flows arise naturally in systems with
continuous symmetry. Consider the skew product
[4] where $\Sigma = 1$ and $h : X \to X$ possesses a hyperbolic
basic set $\Lambda \subset X$ with equilibrium measure $\mu$ (for a
Hölder potential). Let $\nu$ denote Haar measure on $\Gamma$.
Then $\Lambda \times \Gamma$ is partially hyperbolic, and $\mu \times \nu$ is
ergodic (even Bernoulli) for an open dense set of
equivariant flows. Such stably ergodic flows possess
strong statistical properties (rapid decay of correlations,
central-limit theorem); this is a possible explanation
for hypermeander (Brownian-like motion) of spiral
waves in planar excitable media.


Forced Symmetry Breaking 


In applications, symmetry is not perfect and account
should be taken of $\Gamma'$-equivariant perturbations of
[1] for $\Gamma'$ a subgroup of $\Gamma$ (including $\Gamma' = 1$). This
topic is not discussed in this article, except in the
subsections "Robust heteroclinic cycles" and
"Branching patterns and finite determinacy."


Equivariant Bifurcation Theory 


Consider families of ODEs $\dot{x} = f(x, \lambda)$, with bifurcation
parameter $\lambda \in \mathbb{R}$ and vector field $f : \mathbb{R}^n \times
\mathbb{R} \to \mathbb{R}^n$ satisfying $f(0, 0) = 0$ and the $\Gamma$-equivariance
condition


$f(\gamma x, \lambda) = \gamma f(x, \lambda)$
for all $x \in \mathbb{R}^n$, $\lambda \in \mathbb{R}$, $\gamma \in \Gamma$


A local bifurcation from the equilibrium $x = 0$
occurs if $(df)_{0,0}$ is nonhyperbolic. The center subspace
$E^c$ is the sum of generalized eigenspaces
corresponding to eigenvalues on the imaginary
axis, and is $\Gamma$-invariant. By center manifold theory,
local dynamics ($(x, \lambda)$ near $(0, 0)$) are captured by the
center manifold $W^c$. After center manifold reduction
(or Lyapunov-Schmidt reduction if the focus is on
equilibria), it may be assumed that $\mathbb{R}^n = E^c$.

If $(df)_{0,0}$ possesses zero eigenvalues, then there is
a steady-state bifurcation. Generically, $(df)_{0,0} = 0$
and $E^c$ is absolutely irreducible. There are two
subcases.

If $\Gamma$ acts trivially on $\mathbb{R}^n$, then $n = 1$ and generically
there is a saddle-node (or limit point) bifurcation
where the zero sets of $f(x, \lambda)$ and $\pm x^2 + \lambda$ are
diffeomorphic for $(x, \lambda)$ near $(0, 0)$. Higher-order
degeneracies can be treated using singularity theory.
The equilibria and their stability determine the
local dynamics. All bifurcating equilibria have
isotropy $\Gamma$, so there is no symmetry breaking.

From now on, consider the remaining subcase
where $\Gamma$ acts absolutely irreducibly and nontrivially
on $\mathbb{R}^n$. Then $\mathrm{Fix}\,\Gamma = \{0\}$, $f(0, \lambda) = 0$, and $(df)_{0,\lambda} =
c(\lambda)I$, where generically $c'(0) \neq 0$. Assume that
$c'(0) > 0$, so the "trivial solution" $x = 0$ is asymptotically
stable subcritically ($\lambda < 0$) and unstable
supercritically ($\lambda > 0$). Bifurcating solutions lie outside
$\mathrm{Fix}\,\Gamma$ and hence there is spontaneous symmetry
breaking.


Axial Isotropy Subgroups 


The “equivariant branching lemma” guarantees branches of equilibria with isotropy $\Sigma$ for each axial isotropy subgroup. There are three associated branching patterns, see Figure 1.

If $N(\Sigma)/\Sigma = \mathbb{Z}_2$, then $f_\Sigma$ is odd. Generically, $\partial_x^3 f_\Sigma(0,0) \neq 0$, since $(x_1^2 + \cdots + x_n^2)x$ is $\Gamma$-equivariant, and there are two branches of equilibria bifurcating supercritically or subcritically together, and lying on the same group orbit. The branches form a symmetric pitchfork whose direction of branching is determined by $\operatorname{sgn} \partial_x^3 f_\Sigma(0,0)$.

If $N(\Sigma)/\Sigma = 1$, then generically $f_\Sigma$ is even. If all quadratic $\Gamma$-equivariant maps vanish on $\operatorname{Fix} \Sigma$, then the bifurcation is sub/supercritical depending on $\operatorname{sgn} \partial_x^3 f_\Sigma(0,0)$ but the branches lie on distinct group orbits. This is an asymmetric pitchfork.

If $\partial_x^2 f_\Sigma(0,0) \neq 0$, then the equilibria exist transcritically: for $\lambda < 0$ and $\lambda > 0$.

The natural actions of $D_m$ on $\mathbb{R}^2$ are absolutely irreducible. The axial branches are symmetric pitchforks for $m \geq 4$ even, asymmetric pitchforks for $m \geq 5$ odd, and transcritical for $m = 3$.

The actions of $D_m$, $m \geq 5$ odd, provide the simplest instances of hidden symmetries, where certain $N(\Sigma)/\Sigma$-equivariant mappings on $\operatorname{Fix} \Sigma$ do not extend to smooth $\Gamma$-equivariant mappings on $\mathbb{R}^n$.
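The $D_m$ statements above can be probed numerically. The sketch below is only an illustration under stated assumptions: it identifies $\mathbb{R}^2$ with $\mathbb{C}$, takes $D_m$ to be generated by the rotation $z \mapsto e^{2\pi i/m} z$ and the reflection $z \mapsto \bar z$, and uses the truncated field $f(z,\lambda) = \lambda z + a|z|^2 z + b\bar z^{\,m-1}$ with real coefficients $a, b$. This truncation and the function names are assumptions made for the example, not taken from the article. The code checks equivariance on the generators and, for $m = 3$, follows the axial branch on the real axis through $\lambda = 0$.

```python
import cmath

def dm_field(z, lam, m, a=-1.0, b=1.0):
    # Assumed truncated D_m-equivariant field on C = R^2 (a, b real):
    #   f(z, lam) = lam*z + a*|z|^2*z + b*conj(z)^(m-1)
    return lam * z + a * abs(z) ** 2 * z + b * z.conjugate() ** (m - 1)

def equivariance_defect(m, lam=0.3, z=0.4 + 0.7j):
    # Largest |f(g z) - g f(z)| over the two generators of D_m.
    w = cmath.exp(2j * cmath.pi / m)  # rotation by 2*pi/m
    rot = abs(dm_field(w * z, lam, m) - w * dm_field(z, lam, m))
    ref = abs(dm_field(z.conjugate(), lam, m) - dm_field(z, lam, m).conjugate())
    return max(rot, ref)

def axial_branch_m3(lam, a=-1.0, b=1.0):
    # On the real axis with m = 3: f = x*(lam + b*x + a*x^2).
    # Return the nonzero root that tends to 0 as lam -> 0.
    disc = b * b - 4.0 * a * lam
    roots = [(-b + s * disc ** 0.5) / (2.0 * a) for s in (1.0, -1.0)]
    return min(roots, key=abs)

if __name__ == "__main__":
    for m in (3, 4, 5, 6):
        print("m =", m, "equivariance defect:", equivariance_defect(m))
    for lam in (-0.01, 0.01):
        print("lambda =", lam, "axial equilibrium x =", axial_branch_m3(lam))
```

For $m = 3$ the quadratic term $b\bar z^2$ does not vanish on the real axis, so the small root exists on both sides of $\lambda = 0$ and changes sign there, which is the transcritical pattern described in the text.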


Nonaxial Maximal Isotropy Subgroups 


For $\Sigma$ a real maximal isotropy subgroup, $\dim \operatorname{Fix} \Sigma$ odd, there exist branches of equilibria with isotropy $\Sigma$. When $\dim \operatorname{Fix} \Sigma$ is even, there are examples where equilibria exist and examples where no equilibria exist. For $\Sigma$ complex or quaternionic, there exist branches of rotating waves with isotropy $\Sigma$. In the quaternionic case, the rotating waves foliate the $SU(2)$ group orbits according to the Hopf fibration.

Figure 1 Axial branches: (a) supercritical symmetric pitchfork, (b) supercritical asymmetric pitchfork, and (c) transcritical branches.


Submaximal Isotropy Subgroups 


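It has been conjectured falsely that steady-state bifurcation leads generically to equilibria only with maximal isotropy. The simplest counterexample is the 24-element group $\Gamma = \mathbb{Z}_3 \ltimes \mathbb{Z}_2^3$ generated by

\[
\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \qquad
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}
\]

(Alternatively, $\Gamma = \mathbb{T} \oplus \mathbb{Z}_2(-I_3)$, where $\mathbb{T} \subset SO(3)$ is the tetrahedral group.)

The isotropy subgroup $\Sigma = \mathbb{Z}_2(\kappa)$, where $\kappa$ is the second generator above, has two-dimensional fixed-point subspace $\operatorname{Fix} \Sigma = \{(x, y, 0)\}$. The only one-dimensional fixed-point spaces contained in $\operatorname{Fix} \Sigma$ are the $x$- and $y$-axes. The general $\Gamma$-equivariant vector field is

\[ f(x, y, z, \lambda) = \big( g(x^2, y^2, z^2, \lambda)\,x,\ g(y^2, z^2, x^2, \lambda)\,y,\ g(z^2, x^2, y^2, \lambda)\,z \big). \]

After scaling,

\[ g(x^2, y^2, z^2, \lambda) = \lambda - x^2 - a y^2 - b z^2 + o(x^2, y^2, z^2, \lambda) \qquad [6] \]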


Restricting to $\operatorname{Fix} \Sigma$ and dividing out the axial solutions $x = 0$ and $y = 0$ yields at lowest order the equations $\lambda = x^2 + a y^2 = y^2 + b x^2$. Submaximal solutions exist provided $\operatorname{sgn}(a - 1) = \operatorname{sgn}(b - 1)$.

In general, the existence of equilibria with submaximal isotropy must be treated on a case-by-case basis (for each absolutely irreducible representation of $\Gamma$ and isotropy subgroup $\Sigma$).
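The lowest-order equations $\lambda = x^2 + a y^2 = y^2 + b x^2$ are linear in $(x^2, y^2)$ and can be solved in closed form. The following sketch is a hedged illustration only: it assumes the cubic truncation $g = \lambda - u - a v - b w$ of [6] with a cyclic arrangement of the arguments of $g$ (chosen to agree with the lowest-order equations above), and the function names are hypothetical. It checks equivariance under the two generators and computes a submaximal equilibrium in $\operatorname{Fix} \Sigma$ when one exists.

```python
import math

def g(u, v, w, lam, a, b):
    # Cubic truncation of [6]: g = lam - u - a*v - b*w (higher-order terms dropped).
    return lam - u - a * v - b * w

def field(x, y, z, lam, a, b):
    # Assumed form of the truncated equivariant field, cyclic in (x, y, z).
    return (g(x * x, y * y, z * z, lam, a, b) * x,
            g(y * y, z * z, x * x, lam, a, b) * y,
            g(z * z, x * x, y * y, lam, a, b) * z)

def is_equivariant(v, lam, a, b, tol=1e-12):
    # Check f(gamma v) = gamma f(v) for the two generators:
    # rho(x, y, z) = (y, z, x) and kappa(x, y, z) = (x, y, -z).
    x, y, z = v
    f = field(x, y, z, lam, a, b)
    f_rho = field(y, z, x, lam, a, b)
    f_kap = field(x, y, -z, lam, a, b)
    ok_rho = all(abs(p - q) < tol for p, q in zip(f_rho, (f[1], f[2], f[0])))
    ok_kap = all(abs(p - q) < tol for p, q in zip(f_kap, (f[0], f[1], -f[2])))
    return ok_rho and ok_kap

def submaximal_equilibrium(lam, a, b):
    # Solve lam = x^2 + a*y^2 = y^2 + b*x^2 for x, y > 0; return None if impossible.
    det = 1.0 - a * b
    if det == 0.0:
        return None
    x2 = lam * (1.0 - a) / det
    y2 = lam * (1.0 - b) / det
    if x2 <= 0.0 or y2 <= 0.0:
        return None
    return (math.sqrt(x2), math.sqrt(y2), 0.0)

if __name__ == "__main__":
    print(is_equivariant((0.3, -0.7, 0.5), 0.2, 0.5, 0.5))
    print(submaximal_equilibrium(1.0, 0.5, 0.5))   # sgn(a-1) = sgn(b-1)
    print(submaximal_equilibrium(1.0, 0.5, 1.5))   # signs differ
```

Since $x^2 = \lambda(1-a)/(1-ab)$ and $y^2 = \lambda(1-b)/(1-ab)$, both are positive for some sign of $\lambda$ exactly when $1 - a$ and $1 - b$ have the same sign, which matches the existence condition stated in the text.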


Asymptotic Stability 


Subcritical and axial transcritical branches are automatically unstable. Moreover, the existence of a quadratic $\Gamma$-equivariant mapping $q : \mathbb{R}^n \to \mathbb{R}^n$ and $x \in \operatorname{Fix} \Sigma$ such that $(dq)_x$ has eigenvalues with nonzero real part guarantees that branches of equilibria with axial isotropy $\Sigma$ are generically unstable (even when $q|_{\operatorname{Fix} \Sigma} = 0$).

There are no general results for asymptotic stability, and calculations must be done on a case-by-case basis. (The remarks in the subsection “Equilibria” are useful here.)


Branching Patterns and Finite Determinacy 


The following notion of finite determinacy is based on equivariant transversality theory. Assume $\Gamma$ acts absolutely irreducibly. Consider the set $\mathcal{F}$ of $\Gamma$-equivariant vector fields $f : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}^n$ satisfying $(df)_{0,0} = 0$. For an open dense subset of $\mathcal{F}$, branches of relative equilibria near $(0,0)$ are normally hyperbolic. The collection of branches of relative equilibria, together with their isotropy type, direction of branching, and stability properties, is called a branching pattern. These persist under small perturbations and are finitely determined: there exist $q = q_\Gamma \geq 2$ and an open dense subset $\mathcal{U}(q) \subset \mathcal{F}$ such that the branching patterns of $f$ and $f + g$ are identical for $f \in \mathcal{U}(q)$, $g \in \mathcal{F}$, provided $g(x, \lambda) = o(\|x\|^q)$.

Furthermore, branching patterns are strongly finitely determined: there exist $d \geq 2$ and an open dense subset $\mathcal{S}(d) \subset \mathcal{F}$ such that the branching patterns of $f$ and $f + g$ are identical for $f \in \mathcal{S}(d)$ and all (not necessarily equivariant) $g$ satisfying $g(x, \lambda) = o(\|x\|^d)$.

For example, consider the hyperoctahedral group $S_n \ltimes \mathbb{Z}_2^n$, $n \geq 1$. Here $S_n$ acts by permutations of the coordinates $(x_1, \ldots, x_n)$ and $\mathbb{Z}_2^n$ consists of diagonal matrices with entries $\pm 1$. Let $\Gamma = T \ltimes \mathbb{Z}_2^n$, where $T \subset S_n$ is a transitive subgroup. Then $\Gamma$ acts absolutely irreducibly on $\mathbb{R}^n$ and is strongly 3-determined. Submaximal branches of equilibria exist except when $T = S_n$, $T = A_n$ and, if $n = 6$, $T \cong PGL_2(\mathbb{F}_5)$.


Dynamics 


Absolutely irreducible representations have arbitrarily high dimension, so steady-state bifurcation leads to rich dynamics. The group $\Gamma = \mathbb{Z}_3 \ltimes \mathbb{Z}_2^3$ with $\operatorname{sgn}(a - 1) \neq \operatorname{sgn}(b - 1)$ and $a + b > 2$ in [6] yields asymptotically stable heteroclinic cycles with planar connections connecting equilibria in the $x$-, $y$- and $z$-axes (see Figure 2). In $\mathbb{R}^4$, there is the possibility of instant chaos where chaotic dynamics bifurcates directly from the equilibrium 0.
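The role of the two conditions on $a$ and $b$ can be seen from the linearization at an axial equilibrium. The sketch below is an illustration under assumptions, not the article's computation: it uses the cubic truncation $g = \lambda - u - a v - b w$ of [6] with a cyclic arrangement of arguments, and approximates the Jacobian at $(\sqrt{\lambda}, 0, 0)$ by central differences. For this truncation the Jacobian is diagonal with entries $-2\lambda$, $\lambda(1-b)$ and $\lambda(1-a)$.

```python
import math

def field(v, lam, a, b):
    # Assumed cubic truncation of the equivariant field, cyclic in (x, y, z).
    x, y, z = v
    g = lambda u, s, w: lam - u - a * s - b * w
    return [g(x * x, y * y, z * z) * x,
            g(y * y, z * z, x * x) * y,
            g(z * z, x * x, y * y) * z]

def jacobian(v, lam, a, b, h=1e-6):
    # Central-difference approximation of the 3x3 Jacobian at v.
    jac = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        vp, vm = list(v), list(v)
        vp[j] += h
        vm[j] -= h
        fp, fm = field(vp, lam, a, b), field(vm, lam, a, b)
        for i in range(3):
            jac[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return jac

def axial_eigenvalues(lam, a, b):
    # Diagonal entries of the Jacobian at the x-axis equilibrium (sqrt(lam), 0, 0).
    jac = jacobian([math.sqrt(lam), 0.0, 0.0], lam, a, b)
    return jac[0][0], jac[1][1], jac[2][2]

if __name__ == "__main__":
    radial, e_y, e_z = axial_eigenvalues(1.0, 1.8, 0.6)
    print("radial:", radial, "y-direction:", e_y, "z-direction:", e_z)
    print("contraction/expansion ratio:", -e_z / e_y)
```

With $a > 1 > b$ and $\lambda > 0$ the equilibrium is a saddle, expanding in the $y$-direction and contracting in the $z$-direction, which is what allows connections between the axes. In this truncated example the ratio of contraction to expansion is $(a-1)/(1-b)$, and it exceeds 1 exactly when $a + b > 2$.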


Figure 2 Robust heteroclinic cycle for the group $\Gamma = \mathbb{Z}_3 \ltimes \mathbb{Z}_2^3$.

In the absence of quadratic equivariants, the invariant-sphere theorem gives an open set of equivariant vector fields for which an attracting normally hyperbolic flow-invariant $(n - 1)$-dimensional sphere bifurcates supercritically. This simplifies computations of nontrivial dynamics.


Hopf Bifurcation and Mode Interactions 
Equivariant Hopf Bifurcation 


The setting is the same as in the last section, except that $L = (df)_{0,0}$ has imaginary eigenvalues $\pm i\omega$ of algebraic and geometric multiplicity $n/2$. Generically, $\mathbb{R}^n = E^c$ is $\Gamma$-simple: either the direct sum of two isomorphic absolutely irreducible subspaces, or nonabsolutely irreducible.

By Birkhoff normal-form theory (see below), for any $k \geq 1$ there is a $\Gamma$-equivariant change of coordinates after which $f(x, \lambda) = f_k(x, \lambda) + o(\|x\|^k)$, where $f_k$ is $(\Gamma \times S^1)$-equivariant. Here $S^1 = \{\exp(tL) : t \in \mathbb{R}\}$ acts freely on $\mathbb{R}^n$ and $\Gamma \times S^1$ acts complex irreducibly ($\mathcal{D} \cong \mathbb{C}$). Hence, $\dim \operatorname{Fix} J$ is even for each isotropy subgroup $J \subset \Gamma \times S^1$, and $N(J)/J \cong S^1$ when $J$ is maximal. The equivariant Hopf theorem guarantees, generically, branches of rotating waves with absolute period approximately $2\pi/\omega$ for each maximal isotropy subgroup $J$.

The notions of finite and strong finite determinacy extend to complex irreducible representations and the rotating waves persist as periodic solutions for the original $\Gamma$-equivariant vector field $f$. Define the spatial and spatiotemporal symmetry groups $\Delta \subset \Sigma \subset \Gamma$ as in the subsection “Periodic solutions.” Then $J = \{(\sigma, \theta(\sigma)) : \sigma \in \Sigma\}$ is a twisted subgroup, with $\theta : \Sigma \to S^1$ a homomorphism and $\Delta = J \cap \Gamma = \ker \theta$.

In the non-symmetry-breaking case, where $\Gamma$ acts trivially on $\mathbb{R}^2$, phase-amplitude reduction leads to $\mathbb{Z}_2$-equivariant amplitude equations on $\mathbb{R}$ and higher-order degeneracies are amenable to $\mathbb{Z}_2$-equivariant singularity theory. Similar comments apply to $O(2)$-equivariant Hopf bifurcation where the amplitude equations are $D_4$-equivariant. The technique fails for general groups $\Gamma$.


Mode Interactions and Birkhoff Normal Form 


Steady-state and Hopf bifurcations are codimension 1 and occur generically in one-parameter families of $\Gamma$-equivariant vector fields. Multiparameter families may undergo higher-codimension bifurcations called mode interactions. Suppressing parameters, steady-state/steady-state bifurcation occurs when $\mathbb{R}^n = E^c = V_1 \oplus V_2$, where $V_1$ and $V_2$ are absolutely irreducible and $L = (df)_0$ has zero eigenvalues. If $V_1$ and $V_2$ are nonisomorphic then $L = 0$, otherwise $L$ is nilpotent and there is an equivariant Takens-Bogdanov bifurcation. Similarly, there are codimension-2 steady-state/Hopf and Hopf/Hopf bifurcations.

Write $L = S + N$ (uniquely), where $S$ is semisimple, $N$ is nilpotent, and $SN = NS$. Then the closure of $\{\exp tS : t \in \mathbb{R}\}$ is a torus $T^p$, where $p \geq 0$ is the number of rationally independent eigenvalues for $L$.
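Two small cases may help fix the decomposition. They are offered as a hedged illustration with matrices chosen for the example, not taken from the article: a nilpotent linearization of Takens-Bogdanov type and a semisimple Hopf/Hopf linearization with frequencies $\omega_1, \omega_2 > 0$.

```latex
\begin{align*}
\text{Takens-Bogdanov:}\quad
L &= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, &
S &= 0, \quad N = L, \quad p = 0 \ \text{(trivial torus)}, \\[4pt]
\text{Hopf/Hopf:}\quad
L &= \begin{pmatrix} \omega_1 R & 0 \\ 0 & \omega_2 R \end{pmatrix},\
R = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, &
S &= L, \quad N = 0, \\
\exp tS &= \begin{pmatrix} \operatorname{Rot}(\omega_1 t) & 0 \\ 0 & \operatorname{Rot}(\omega_2 t) \end{pmatrix}, &
p &= \begin{cases} 2, & \omega_1/\omega_2 \notin \mathbb{Q}, \\ 1, & \omega_1/\omega_2 \in \mathbb{Q}. \end{cases}
\end{align*}
```

In the nonresonant Hopf/Hopf case the one-parameter group $\exp tS$ winds densely around a 2-torus, so its closure is $T^2$. In the resonant case the group closes up into a circle.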

For each $k \geq 1$, there is a $\Gamma$-equivariant degree-$k$ polynomial change of coordinates $P : \mathbb{R}^n \to \mathbb{R}^n$ satisfying $P(0) = 0$, $(dP)_0 = I$ transforming $f$ to Birkhoff normal form $f_k + o(\|x\|^k)$, where $f_k$ is $(\Gamma \times T^p)$-equivariant.

If $N \neq 0$, then $\{\exp tN^T : t \in \mathbb{R}\} \cong \mathbb{R}$ and $f_k$ can be chosen so that the nonlinear terms are $(\Gamma \times T^p \times \mathbb{R})$-equivariant. The linear terms are not $\mathbb{R}$-equivariant.

The study of mode interactions proceeds by first analyzing $(\Gamma \times T^p)$-equivariant normal forms, then considering exponentially small effects of the $\Gamma$-equivariant tail. Versions of the equivariant branching lemma and equivariant Hopf theorem establish existence of certain solutions. There are numerous examples of robust heteroclinic cycles connecting (relative) equilibria and periodic solutions, symmetric chaos, and symmetry-increasing bifurcations.


Bifurcations from Relative Equilibria 
and Periodic Solutions 


Using the skew product [4], bifurcations from a relative equilibrium with isotropy $\Sigma$ for a $\Gamma$-equivariant vector field reduce to bifurcations from a fully symmetric equilibrium for a $\Sigma$-equivariant vector field $h$ coupled with $\Gamma$ drifts. If $h$ possesses (relative) equilibria or periodic solutions, then the drift is determined generically as in the subsections “Relative equilibria and skew products” and “Relative periodic solutions.” Nevertheless, solving the drift equation can be useful for understanding behavior in physical space. This is facilitated by making equivariant polynomial changes of coordinates $(\gamma Q(x), P(x))$ putting $h$ into Birkhoff normal form and simplifying $\xi$.

Bifurcations from (relative) periodic solutions also reduce, mainly, to bifurcations from equilibria (with enlarged symmetry group). Based on the discussion in the subsection “Relative periodic solutions,” it suffices to consider bifurcations from isolated periodic solutions $P = \{x(t)\}$ with spatial symmetry $\Delta$ and spatiotemporal symmetry $\Sigma$. Write $x(T) = \sigma x(0)$, where $T$ is the relative period and $\sigma$ is chosen so that the automorphism $\delta \mapsto \sigma^{-1}\delta\sigma$, $\delta \in \Delta$, has finite order $k$. Form the semidirect product $\Delta \rtimes \mathbb{Z}_{2k}$ by adjoining to $\Delta$ an element $\tau$ of order $2k$ such that $\tau^{-1}\delta\tau = \sigma^{-1}\delta\sigma$, for $\delta \in \Delta$. Codimension-1 bifurcations from $P$ are in one-to-one correspondence (modulo tail terms) with bifurcations from fully symmetric equilibria for a $(\Delta \rtimes \mathbb{Z}_{2k})$-equivariant vector field. In particular, period-preserving and period-doubling bifurcations from $P$ reduce to steady-state bifurcations, and Naimark-Sacker bifurcations reduce to Hopf bifurcations. This framework incorporates issues such as suppression of period doubling. Similar results hold for higher-codimension bifurcations.

The skew products [4] and [5] are valid for proper actions of certain noncompact Lie groups $\Gamma$ provided the spatial symmetries are compact, leading to explanations of spiral and scroll wave phenomena in excitable media.

When the spatial symmetry group is noncompact, $E^c$ may be infinite-dimensional and center manifold reduction may break down due to continuous-spectrum issues. For Euclidean symmetry, there is a theory of modulation or Ginzburg-Landau equations.


See also: Bifurcation Theory; Bifurcations in Fluid 
Dynamics; Bifurcations of Periodic Orbits; Central 
Manifolds, Normal Forms; Chaos and Attractors; 
Electroweak Theory; Finite Group Symmetry Breaking; 
Hyperbolic Dynamical Systems; Quantum Spin Systems; 
Quasiperiodic Systems; Singularity and Bifurcation 
Theory. 
name of “reduction” that restricts the study of its
dynamics to a system of smaller dimension. This 
procedure is also used in a purely geometric context 
to construct new nontrivial manifolds having var- 
ious additional structures. 

Most of the reduction methods can be seen as 
constructions that systematize the techniques of 
elimination of variables found in classical 
mechanics. These procedures consist basically of 
two steps. First, one restricts the dynamics to flow- 
invariant submanifolds of the system in question 
and, second, one projects the restricted dynamics 
onto the symmetry orbit quotients of the spaces 
constructed in the first step. Sometimes, the flow-invariant manifolds appear as the level sets of a momentum map induced by the symmetry of the system.


Symmetry Reduction 
The Symmetries of a System 


The standard mathematical fashion to describe the symmetries of a dynamical system (see Dynamical Systems in Mathematical Physics: An Illustration from Water Waves) $X \in \mathfrak{X}(M)$ defined on a manifold $M$ ($\mathfrak{X}(M)$ denotes the Lie algebra of smooth vector fields on $M$ endowed with the Jacobi-Lie bracket $[\cdot,\cdot]$) consists in studying its invariance properties with respect to a smooth Lie group $\Phi : G \times M \to M$ (continuous symmetries) or Lie algebra $\phi : \mathfrak{g} \to \mathfrak{X}(M)$ (infinitesimal symmetry) action. Recall that $\Phi$ is a (left) action if the map $g \in G \mapsto \Phi(g, \cdot) \in \operatorname{Diff}(M)$ is a group homomorphism, where $\operatorname{Diff}(M)$ denotes the group of smooth diffeomorphisms of the manifold $M$. The map $\phi$ is a (left) Lie algebra action if the map $\xi \in \mathfrak{g} \mapsto \phi(\xi) \in \mathfrak{X}(M)$ is a Lie algebra antihomomorphism and the map $(m, \xi) \in M \times \mathfrak{g} \mapsto \phi(\xi)(m) \in TM$ is smooth. The vector field $X$ is said to be $G$-symmetric whenever it is equivariant with respect to the $G$-action $\Phi$, that is, $X \circ \Phi_g = T\Phi_g \circ X$, for any $g \in G$. The space of $G$-symmetric vector fields on $M$ is denoted by $\mathfrak{X}(M)^G$. The flow $F_t$ of a $G$-symmetric vector field $X \in \mathfrak{X}(M)^G$ is $G$-equivariant, that is, $F_t \circ \Phi_g = \Phi_g \circ F_t$, for any $g \in G$. The vector field $X$ is said to be $\mathfrak{g}$-symmetric if $[\phi(\xi), X] = 0$, for any $\xi \in \mathfrak{g}$.

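If $\mathfrak{g}$ is the Lie algebra of the Lie group $G$ (see Lie Groups: General Theory) then the infinitesimal generators $\xi_M \in \mathfrak{X}(M)$ of a smooth $G$-group action defined by

\[ \xi_M(m) := \left.\frac{d}{dt}\right|_{t=0} \Phi(\exp t\xi, m), \quad \xi \in \mathfrak{g},\ m \in M \]

constitute a smooth Lie algebra $\mathfrak{g}$-action and we denote in this case $\phi(\xi) = \xi_M$.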

If $m \in M$, the closed Lie subgroup $G_m := \{g \in G \mid \Phi(g, m) = m\}$ is called the isotropy or symmetry subgroup of $m$. Similarly, the Lie subalgebra $\mathfrak{g}_m := \{\xi \in \mathfrak{g} \mid \phi(\xi)(m) = 0\}$ is called the isotropy or symmetry subalgebra of $m$. If $\mathfrak{g}$ is the Lie algebra of $G$ and the Lie algebra action is given by the infinitesimal generators, then $\mathfrak{g}_m$ is the Lie algebra of $G_m$. The action is called free if $G_m = \{e\}$ for every $m \in M$ and locally free if $\mathfrak{g}_m = \{0\}$ for every $m \in M$. We will write interchangeably $\Phi(g, m) = \Phi_g(m) = \Phi^m(g) = g \cdot m$, for $m \in M$ and $g \in G$.

In this article we will focus mainly on continuous symmetries induced by proper Lie group actions. The action $\Phi$ is called proper whenever for any two convergent sequences $\{m_n\}_{n \in \mathbb{N}}$ and $\{g_n \cdot m_n := \Phi(g_n, m_n)\}_{n \in \mathbb{N}}$ in $M$, there exists a convergent subsequence $\{g_{n_k}\}_{k \in \mathbb{N}}$ in $G$. Compact group actions are obviously proper.


Symmetry Reduction of Vector Fields 


Let $M$ be a smooth manifold and $G$ a Lie group acting properly on $M$. Let $X \in \mathfrak{X}(M)^G$ and $F_t$ be its (necessarily equivariant) flow. For any isotropy subgroup $H$ of the $G$-action on $M$, the $H$-isotropy type submanifold $M_H := \{m \in M \mid G_m = H\}$ is preserved by the flow $F_t$. This property is known as the law of conservation of isotropy. The properness of the action guarantees that $G_m$ is compact and that the (connected components of) $M_H$ are embedded submanifolds of $M$ for any closed subgroup $H$ of $G$. The manifolds $M_H$ are, in general, not closed in $M$. Moreover, the quotient group $N(H)/H$ (where $N(H)$ denotes the normalizer of $H$ in $G$) acts freely and properly on $M_H$. Hence, if $\pi_H : M_H \to M_H/(N(H)/H)$ denotes the projection onto orbit space and $i_H : M_H \to M$ is the injection, the vector field $X$ induces a unique vector field $X^H$ on the quotient $M_H/(N(H)/H)$ defined by $X^H \circ \pi_H = T\pi_H \circ X \circ i_H$, whose flow $F_t^H$ is given by $F_t^H \circ \pi_H = \pi_H \circ F_t \circ i_H$. We will refer to $X^H \in \mathfrak{X}(M_H/(N(H)/H))$ as the $H$-isotropy type reduced vector field induced by $X$.
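The law of conservation of isotropy can be observed in a toy computation. The sketch below is a hedged illustration with an example chosen for the purpose: $G = \mathbb{Z}_2$ acts on $M = \mathbb{R}^2$ by $\kappa(x, y) = (x, -y)$, the vector field is an assumed $\mathbb{Z}_2$-symmetric one, and the flow is approximated with a classical Runge-Kutta step. Here $M_{\mathbb{Z}_2}$ is the $x$-axis and $M_{\{e\}}$ is its complement, on which $N(H)/H = \mathbb{Z}_2$ acts freely.

```python
def field(p):
    # Assumed Z_2-symmetric field: first component even in y, second odd in y.
    x, y = p
    return (x - x ** 3 - x * y * y, y * (x * x - 1.0))

def rk4_step(p, dt):
    # One classical fourth-order Runge-Kutta step.
    def shift(u, v, s):
        return (u[0] + s * v[0], u[1] + s * v[1])
    k1 = field(p)
    k2 = field(shift(p, k1, dt / 2.0))
    k3 = field(shift(p, k2, dt / 2.0))
    k4 = field(shift(p, k3, dt))
    return (p[0] + dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0,
            p[1] + dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0)

def flow(p, t, steps=2000):
    # Approximate time-t flow map F_t.
    dt = t / steps
    for _ in range(steps):
        p = rk4_step(p, dt)
    return p

def kappa(p):
    # The nontrivial element of Z_2.
    return (p[0], -p[1])

if __name__ == "__main__":
    print(flow((0.3, 0.0), 2.0))            # starts with isotropy Z_2
    print(flow((0.3, 0.5), 2.0))            # starts with trivial isotropy
    print(flow(kappa((0.3, 0.5)), 2.0))     # compare with kappa applied afterwards
```

A point starting on the $x$-axis stays on it, a point starting off the axis stays off it, and flowing commutes with $\kappa$. The reduced vector field on $M_{\{e\}}/\mathbb{Z}_2$ can be pictured as the restriction of the field to the open upper half-plane.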

This reduction technique has been widely 
exploited in handling specific dynamical systems. 
When the symmetry group G is compact and we are 
dealing with a linear action, the construction of the 
quotient My/(N(H)/H) can be implemented in a 
very explicit and convenient manner by using the 
invariant polynomials of the action and the theo- 
rems of Hilbert and Schwarz—Mather. 


Symplectic Reduction 


Symplectic or Marsden-Weinstein reduction is a procedure that implements symmetry reduction for the symmetric Hamiltonian systems defined on a symplectic manifold $(M, \omega)$. The particular case in which the symplectic manifold is a cotangent bundle is dealt with separately (see Cotangent Bundle Reduction). We recall that the Hamiltonian vector field $X_h \in \mathfrak{X}(M)$ associated to the Hamiltonian function $h \in C^\infty(M)$ is uniquely determined by the equality $\omega(X_h, \cdot) = dh$. In this context, the symmetries $\Phi : G \times M \to M$ of interest are given by symplectic or canonical transformations, that is, $\Phi_g^* \omega = \omega$, for any $g \in G$. For canonical actions each $G$-invariant function $h \in C^\infty(M)^G$ has an associated $G$-symmetric Hamiltonian vector field $X_h$. A Lie algebra action $\phi$ is called symplectic or canonical if $\pounds_{\phi(\xi)} \omega = 0$ for all $\xi \in \mathfrak{g}$, where $\pounds$ denotes the Lie derivative operator. If the Lie algebra action is induced from a canonical Lie group action by taking its infinitesimal generators, then it is also canonical.


Momentum Maps 


The symmetry reduction described in the previous section for general vector fields does not produce a well-adapted answer for symplectic manifolds $(M, \omega)$ in the sense that the reduced spaces $M_H/(N(H)/H)$ are, in general, not symplectic. To solve this problem one has to use the conservation laws associated to the canonical action, which often appear as momentum maps.

Let $G$ be a Lie group acting canonically on the symplectic manifold $(M, \omega)$. Suppose that for any $\xi \in \mathfrak{g}$, the vector field $\xi_M$ is Hamiltonian, with Hamiltonian function $J^\xi \in C^\infty(M)$ and that $\xi \in \mathfrak{g} \mapsto J^\xi \in C^\infty(M)$ is linear. The map $J : M \to \mathfrak{g}^*$ defined by the relation $\langle J(z), \xi \rangle = J^\xi(z)$, for all $\xi \in \mathfrak{g}$ and $z \in M$, is called a momentum map of the $G$-action (see Hamiltonian Group Actions). Momentum maps, if they exist, are determined up to a constant in $\mathfrak{g}^*$ for any connected component of $M$.


Examples 1

(i) (Linear momentum) The phase space of an $N$-particle system is the cotangent space $T^*\mathbb{R}^{3N}$ endowed with its canonical symplectic structure. The additive group $\mathbb{R}^3$, whose Lie algebra is abelian and is also equal to $\mathbb{R}^3$, acts canonically on it by spatial translation on each factor: $v \cdot (q^i, p_i) = (q^i + v, p_i)$, with $i = 1, \ldots, N$. This action has an associated momentum map $J : T^*\mathbb{R}^{3N} \to \mathbb{R}^3$, where we identified the dual of $\mathbb{R}^3$ with itself using the Euclidean inner product, which coincides with the classical linear momentum $J(q^i, p_i) = \sum_{i=1}^N p_i$.

(ii) (Angular momentum) Let $SO(3)$ act on $\mathbb{R}^3$ and then, by lift, on $T^*\mathbb{R}^3$, that is, $A \cdot (q, p) = (Aq, Ap)$. This action is canonical and has as associated momentum map $J : T^*\mathbb{R}^3 \to \mathfrak{so}(3)^* \cong \mathbb{R}^3$, the classical angular momentum $J(q, p) = q \times p$.

(iii) (Lifted actions on cotangent bundles) The previous two examples are particular cases of the following situation. Let $\Phi : G \times Q \to Q$ be a smooth Lie group action. The (left) cotangent lifted action of $G$ on $T^*Q$ is given by $g \cdot \alpha_q := T^*_{g \cdot q}\Phi_{g^{-1}}(\alpha_q)$ for $g \in G$ and $\alpha_q \in T^*Q$. Cotangent lifted actions preserve the canonical 1-form on $T^*Q$ and hence are canonical. They admit an associated momentum map $J : T^*Q \to \mathfrak{g}^*$ given by $\langle J(\alpha_q), \xi \rangle = \alpha_q(\xi_Q(q))$, for any $\alpha_q \in T^*Q$ and any $\xi \in \mathfrak{g}$.

(iv) (Symplectic linear actions) Let $(V, \omega)$ be a symplectic linear space and let $G$ be a subgroup of the linear symplectic group, acting naturally on $V$. By the choice of $G$ this action is canonical and has a momentum map given by $\langle J(v), \xi \rangle = \tfrac{1}{2}\omega(\xi_V(v), v)$, for $\xi \in \mathfrak{g}$ and $v \in V$ arbitrary.


Properties of the Momentum Map 


The main feature of the momentum map that makes it of interest for use in reduction is that it encodes conservation laws for $G$-symmetric Hamiltonian systems. Noether's theorem states that the momentum map is a constant of the motion for the Hamiltonian vector field $X_h$ associated to any $G$-invariant function $h \in C^\infty(M)^G$ (see Symmetries and Conservation Laws).
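Noether's theorem can be checked numerically for the angular momentum map $J(q, p) = q \times p$. The sketch below is an illustration under assumptions chosen for the example: the Hamiltonian is $h = |p|^2/2 + V(q)$ with the $SO(3)$-invariant potential $V(q) = |q|^2/2 + |q|^4/4$, and Hamilton's equations are integrated with a leapfrog scheme. For a central force each leapfrog substep leaves $q \times p$ unchanged up to rounding, so the conservation is visible to near machine precision.

```python
import math

def cross(a, b):
    # Cross product in R^3; the momentum map is J(q, p) = q x p.
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def grad_V(q):
    # Gradient of the assumed invariant potential V(q) = |q|^2/2 + |q|^4/4.
    r2 = sum(c * c for c in q)
    return [(1.0 + r2) * c for c in q]

def leapfrog(q, p, dt, steps):
    # Kick-drift-kick integration of q' = p, p' = -grad V(q).
    q, p = list(q), list(p)
    for _ in range(steps):
        p = [pi - 0.5 * dt * gi for pi, gi in zip(p, grad_V(q))]
        q = [qi + dt * pi for qi, pi in zip(q, p)]
        p = [pi - 0.5 * dt * gi for pi, gi in zip(p, grad_V(q))]
    return q, p

def rotate_z(theta, v):
    # Rotation about the z-axis, an element of SO(3).
    c, s = math.cos(theta), math.sin(theta)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1], v[2]]

if __name__ == "__main__":
    q0, p0 = [1.0, 0.2, -0.3], [0.1, 0.8, 0.4]
    q1, p1 = leapfrog(q0, p0, 0.01, 2000)
    print("J at t = 0: ", cross(q0, p0))
    print("J at t = 20:", cross(q1, p1))
    print("J(Aq, Ap):  ", cross(rotate_z(0.7, q0), rotate_z(0.7, p0)))
    print("A J(q, p):  ", rotate_z(0.7, cross(q0, p0)))
```

The last two lines compare $J(Aq, Ap)$ with $A\,J(q, p)$ for a rotation $A$, which is the equivariance of the angular momentum map under the identification $\mathfrak{so}(3)^* \cong \mathbb{R}^3$.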

The derivative $TJ$ of the momentum map satisfies the following two properties: $\operatorname{range}(T_m J) = (\mathfrak{g}_m)^\circ$ and $\ker T_m J = (\mathfrak{g} \cdot m)^\omega$, for any $m \in M$, where $(\mathfrak{g}_m)^\circ$ denotes the annihilator in $\mathfrak{g}^*$ of the isotropy subalgebra $\mathfrak{g}_m$ of $m$, $\mathfrak{g} \cdot m := T_m(G \cdot m) = \{\xi_M(m) \mid \xi \in \mathfrak{g}\}$ is the tangent space at $m$ to the $G$-orbit that contains this point, and $(\mathfrak{g} \cdot m)^\omega$ is the symplectic orthogonal space to $\mathfrak{g} \cdot m$ in the symplectic vector space $(T_m M, \omega(m))$. The first relation is sometimes called the bifurcation lemma since it establishes a link between the symmetry of a point and the rank of the momentum map at that point.

The existence of the momentum map for a given canonical action is not guaranteed. A momentum map exists if and only if the linear map $\rho : [\xi] \in \mathfrak{g}/[\mathfrak{g}, \mathfrak{g}] \mapsto [\omega(\xi_M, \cdot)] \in H^1(M, \mathbb{R})$ is identically zero. Thus, if $H^1(M, \mathbb{R}) = 0$ or $\mathfrak{g}/[\mathfrak{g}, \mathfrak{g}] = H^1(\mathfrak{g}, \mathbb{R}) = 0$ then $\rho \equiv 0$. In particular, if $\mathfrak{g}$ is semisimple, the “first Whitehead lemma” states that $H^1(\mathfrak{g}, \mathbb{R}) = 0$ and therefore a momentum map always exists for canonical semisimple Lie algebra actions.

A natural question to ask is when the map $(\mathfrak{g}, [\cdot,\cdot]) \to (C^\infty(M), \{\cdot,\cdot\})$ defined by $\xi \mapsto J^\xi$, $\xi \in \mathfrak{g}$, is a Lie algebra homomorphism, that is, $J^{[\xi, \eta]} = \{J^\xi, J^\eta\}$, $\xi, \eta \in \mathfrak{g}$. Here $\{\cdot,\cdot\} : C^\infty(M) \times C^\infty(M) \to C^\infty(M)$ denotes the Poisson bracket associated to the symplectic form $\omega$ of $M$ defined by $\{f, h\} := \omega(X_f, X_h)$, $f, h \in C^\infty(M)$. This is the case if and only if $T_z J(\xi_M(z)) = -\operatorname{ad}^*_\xi J(z)$, for any $\xi \in \mathfrak{g}$, $z \in M$, where $\operatorname{ad}^*$ is the dual of the adjoint representation $\operatorname{ad} : (\xi, \eta) \in \mathfrak{g} \times \mathfrak{g} \mapsto [\xi, \eta] \in \mathfrak{g}$ of $\mathfrak{g}$ on itself. A momentum map that satisfies this relation is called infinitesimally equivariant. The reason behind this terminology is that this is the infinitesimal version of global or coadjoint equivariance: $J$ is $G$-equivariant if $\operatorname{Ad}^*_{g^{-1}} \circ J = J \circ \Phi_g$ or, equivalently, $J^{\operatorname{Ad}_g \xi}(g \cdot z) = J^\xi(z)$, for all $g \in G$, $\xi \in \mathfrak{g}$, and $z \in M$; $\operatorname{Ad}^*$ denotes the dual of the adjoint representation $\operatorname{Ad}$ of $G$ on $\mathfrak{g}$. Actions admitting infinitesimally equivariant momentum maps are called Hamiltonian actions and Lie group actions with coadjoint equivariant momentum maps are called globally Hamiltonian actions. If the symmetry group $G$ is connected then global and infinitesimal equivariance of the momentum map are equivalent concepts. If $\mathfrak{g}$ acts canonically on $(M, \omega)$ and $H^1(\mathfrak{g}, \mathbb{R}) = \{0\}$ then this action admits at most one infinitesimally equivariant momentum map.

Since momentum maps are not uniquely defined, one may ask whether one can choose them to be equivariant. It turns out that if the momentum map is associated to the action of a compact Lie group, this can always be done. Momentum maps of cotangent lifted actions are also equivariant as are momentum maps defined by symplectic linear actions. Canonical actions of semisimple Lie algebras on symplectic manifolds admit infinitesimally equivariant momentum maps, since the “second Whitehead lemma” states that $H^2(\mathfrak{g}, \mathbb{R}) = 0$ if $\mathfrak{g}$ is semisimple. We shall identify below a specific element of $H^2(\mathfrak{g}, \mathbb{R})$ which is the obstruction to the equivariance of a momentum map (assuming it exists).

Even though, in general, it is not possible to choose a coadjoint equivariant momentum map, it turns out that when the symplectic manifold is connected there is an affine action on the dual of the Lie algebra with respect to which the momentum map is equivariant. Define the nonequivariance 1-cocycle associated to $J$ as the map $\sigma : G \to \mathfrak{g}^*$ given by $g \mapsto J(\Phi_g(z)) - \operatorname{Ad}^*_{g^{-1}}(J(z))$. The connectivity of $M$ implies that the right-hand side of this equality is independent of the point $z \in M$. In addition, $\sigma$ is a (left) $\mathfrak{g}^*$-valued 1-cocycle on $G$ with respect to the coadjoint representation of $G$ on $\mathfrak{g}^*$, that is, $\sigma(gh) = \sigma(g) + \operatorname{Ad}^*_{g^{-1}} \sigma(h)$ for all $g, h \in G$. Relative to the affine action $\Theta : G \times \mathfrak{g}^* \to \mathfrak{g}^*$ given by $(g, \mu) \mapsto \operatorname{Ad}^*_{g^{-1}} \mu + \sigma(g)$, the momentum map $J$ is equivariant. The “reduction lemma,” the main technical ingredient in the proof of the reduction theorem, states that for any $m \in M$ we have

\[ \mathfrak{g}_{J(m)} \cdot m = \mathfrak{g} \cdot m \cap \ker T_m J = \mathfrak{g} \cdot m \cap (\mathfrak{g} \cdot m)^\omega \]

where $\mathfrak{g}_{J(m)}$ is the Lie algebra of the isotropy group $G_{J(m)}$ of $J(m) \in \mathfrak{g}^*$ with respect to the affine action of $G$ on $\mathfrak{g}^*$ induced by the nonequivariance 1-cocycle of $J$.
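A standard small case shows a nonzero cocycle at work. The computation below is a hedged illustration that is not part of the original article: it assumes $G = \mathbb{R}^2$ acting on $M = \mathbb{R}^2$ by translations, the symplectic form $\omega = dx \wedge dy$, the convention $\omega(X_h, \cdot) = dh$ used above, and the Euclidean pairing to identify $\mathfrak{g}^*$ with $\mathbb{R}^2$.

```latex
\begin{align*}
\xi = (a, b) \in \mathfrak{g} = \mathbb{R}^2: \quad
\xi_M &= a\,\partial_x + b\,\partial_y, \\
\omega(\xi_M, \cdot) &= a\,dy - b\,dx = d(ay - bx)
\;\Longrightarrow\; J^\xi(x, y) = ay - bx, \quad J(x, y) = (y, -x), \\
\sigma(u, v) &= J(x + u,\, y + v) - J(x, y) = (v, -u)
\quad (\mathrm{Ad}^* \text{ is trivial because } G \text{ is abelian}), \\
\sigma(g + h) &= \sigma(g) + \sigma(h), \\
\{J^\xi, J^\eta\} &= \omega(\xi_M, \eta_M) = ad - bc \neq 0 = J^{[\xi, \eta]}
\quad \text{for } \eta = (c, d) \text{ with } ad \neq bc.
\end{align*}
```

The cocycle $\sigma$ does not depend on the base point, as connectivity of $M$ requires, and $J$ fails to be infinitesimally equivariant. It is nevertheless equivariant for the affine action $\Theta(g, \mu) = \mu + \sigma(g)$, since $J(g \cdot z) = J(z) + \sigma(g)$.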


The Symplectic Reduction Theorem 


The symplectic reduction procedure that we now present consists of constructing a new symplectic manifold out of a given symmetric one in which the conservation laws encoded in the form of a momentum map and the degeneracies associated to the symmetry have been eliminated. This strategy allows the reduction of a symmetric Hamiltonian dynamical system to a dimensionally smaller one. This reduction procedure preserves the symplectic category, that is, if we start with a Hamiltonian system on a symplectic manifold, the reduced system is also a Hamiltonian system on a symplectic manifold. The reduced symplectic manifold is usually referred to as the symplectic or Marsden-Weinstein reduced space.


Theorem 2 Let $\Phi : G \times M \to M$ be a free proper canonical action of the Lie group $G$ on the connected symplectic manifold $(M, \omega)$. Suppose that this action has an associated momentum map $J : M \to \mathfrak{g}^*$, with nonequivariance 1-cocycle $\sigma : G \to \mathfrak{g}^*$. Let $\mu \in \mathfrak{g}^*$ be a value of $J$ and denote by $G_\mu$ the isotropy of $\mu$ under the affine action of $G$ on $\mathfrak{g}^*$. Then:

(i) The space $M_\mu := J^{-1}(\mu)/G_\mu$ is a regular quotient manifold and, moreover, it is a symplectic manifold with symplectic form $\omega_\mu$ uniquely characterized by the relation

\[ \pi_\mu^* \omega_\mu = i_\mu^* \omega. \]

The maps $i_\mu : J^{-1}(\mu) \to M$ and $\pi_\mu : J^{-1}(\mu) \to J^{-1}(\mu)/G_\mu$ denote the inclusion and the projection, respectively. The pair $(M_\mu, \omega_\mu)$ is called the symplectic point reduced space.

(ii) Let $h \in C^\infty(M)^G$ be a $G$-invariant Hamiltonian. The flow $F_t$ of the Hamiltonian vector field $X_h$ leaves the connected components of $J^{-1}(\mu)$ invariant and commutes with the $G$-action, so it induces a flow $F_t^\mu$ on $M_\mu$ defined by $\pi_\mu \circ F_t \circ i_\mu = F_t^\mu \circ \pi_\mu$.

(iii) The vector field generated by the flow $F_t^\mu$ on $(M_\mu, \omega_\mu)$ is Hamiltonian with associated reduced Hamiltonian function $h_\mu \in C^\infty(M_\mu)$ defined by $h_\mu \circ \pi_\mu = h \circ i_\mu$. The vector fields $X_h$ and $X_{h_\mu}$ are $\pi_\mu$-related. The triple $(M_\mu, \omega_\mu, h_\mu)$ is called the reduced Hamiltonian system.

(iv) Let $k \in C^\infty(M)^G$ be another $G$-invariant function. Then $\{h, k\}$ is also $G$-invariant and $\{h, k\}_\mu = \{h_\mu, k_\mu\}_{M_\mu}$, where $\{\cdot,\cdot\}_{M_\mu}$ denotes the Poisson bracket associated to the symplectic form $\omega_\mu$ on $M_\mu$.
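To make Theorem 2 concrete, here is a hedged worked example that is not part of the original article: the planar central force problem. The assumptions are that $M = T^*(\mathbb{R}^2 \setminus \{0\})$ carries the canonical form $\omega = dq^1 \wedge dp_1 + dq^2 \wedge dp_2$, that $G = SO(2)$ acts by cotangent-lifted rotations (a free, proper, canonical action with equivariant momentum map), and that the potential $V$ depends only on $r = |q|$. In polar coordinates $(r, \theta, p_r, p_\theta)$ the reduction reads:

```latex
\begin{align*}
J(q, p) &= q^1 p_2 - q^2 p_1 = p_\theta, \qquad G_\mu = SO(2), \\
J^{-1}(\mu) &= \{p_\theta = \mu\}, \qquad
M_\mu = J^{-1}(\mu)/SO(2) \cong \{(r, p_r) : r > 0\}, \\
i_\mu^* \omega &= dr \wedge dp_r = \pi_\mu^* \omega_\mu
\;\Longrightarrow\; \omega_\mu = dr \wedge dp_r, \\
h &= \tfrac{1}{2}\Big(p_r^2 + \frac{p_\theta^2}{r^2}\Big) + V(r)
\;\Longrightarrow\;
h_\mu(r, p_r) = \tfrac{1}{2} p_r^2 + \frac{\mu^2}{2 r^2} + V(r).
\end{align*}
```

The reduced space has dimension $4 - 1 - 1 = 2$, and the reduced Hamiltonian system is the familiar radial motion in the effective potential $V(r) + \mu^2/(2r^2)$.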


Reconstruction of Dynamics 


We pose now the question converse to the reduction of a Hamiltonian system. Assume that an integral curve $c_\mu(t)$ of the reduced Hamiltonian system $X_{h_\mu}$ on $(M_\mu, \omega_\mu)$ is known. Let $m_0 \in J^{-1}(\mu)$ be given. One can determine from this data the integral curve of the Hamiltonian system $X_h$ with initial condition $m_0$. In other words, one can reconstruct the solution of the given system knowing the corresponding reduced solution. The general method of reconstruction is the following. Pick a smooth curve $d(t)$ in $J^{-1}(\mu)$ such that $d(0) = m_0$ and $\pi_\mu(d(t)) = c_\mu(t)$. Then, if $c(t)$ denotes the integral curve of $X_h$ with $c(0) = m_0$, we can write $c(t) = g(t) \cdot d(t)$ for some smooth curve $g(t)$ in $G_\mu$ that is obtained in two steps. First, one finds a smooth curve $\xi(t)$ in $\mathfrak{g}_\mu$ such that $\xi(t)_M(d(t)) = X_h(d(t)) - \dot{d}(t)$. With the $\xi(t) \in \mathfrak{g}_\mu$ just obtained, one solves the nonautonomous differential equation $\dot{g}(t) = T_e L_{g(t)}\,\xi(t)$ on $G_\mu$ with $g(0) = e$.
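As a hedged illustration (an example assumed here, not taken from the article), consider planar motion in a central potential $V(r)$ with $G = SO(2)$, polar coordinates $(r, \theta, p_r, p_\theta)$, Hamiltonian $h = \tfrac{1}{2}(p_r^2 + p_\theta^2/r^2) + V(r)$, momentum map $J = p_\theta$, and a known reduced solution $c_\mu(t) = (r(t), p_r(t))$. The two reconstruction steps then become explicit:

```latex
\begin{align*}
d(t) &= (r(t),\, \theta_0,\, p_r(t),\, \mu) \in J^{-1}(\mu), \\
X_h(d(t)) - \dot{d}(t) &= \frac{\mu}{r(t)^2}\,\partial_\theta
\;\Longrightarrow\; \xi(t) = \frac{\mu}{r(t)^2} \in \mathfrak{g}_\mu = \mathbb{R}, \\
\dot{g}(t) = T_e L_{g(t)}\,\xi(t),\ g(0) = e
&\;\Longrightarrow\; g(t) = \text{rotation by } \varphi(t) = \int_0^t \frac{\mu}{r(s)^2}\,ds, \\
c(t) = g(t) \cdot d(t) &= (r(t),\, \theta_0 + \varphi(t),\, p_r(t),\, \mu).
\end{align*}
```

Because the group is abelian, the second step reduces to a single quadrature for the angle, which recovers the classical relation $\dot\theta = \mu / r^2$.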


The Orbit Formulation of the Symplectic 
Reduction Theorem 


There is an alternative approach to the reduction 
theorem which consists of choosing as numerator of 
the symplectic reduced space the group invariant 
saturation of the level sets of the momentum map. 
This option produces as a result a space that is 
symplectomorphic to the Marsden-Weinstein quo- 
tient but presents the advantage of being more 
appropriate in the context of quantization problems. 
Additionally, this approach makes easier the com- 
parison of the symplectic reduced spaces corres- 
ponding to different values of the momentum map 
which is important in the context of Poisson 
reduction (see Poisson Reduction). In carrying out 
this construction, one needs to use the natural 
symplectic structures that one can define on the 
orbits of the affine action of a group on the dual of 
its Lie algebra and that we now quickly review. 

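Let $G$ be a Lie group, $\sigma : G \to \mathfrak{g}^*$ a coadjoint 1-cocycle, and $\mu \in \mathfrak{g}^*$. Let $\mathcal{O}_\mu$ be the orbit through $\mu$ of the affine $G$-action on $\mathfrak{g}^*$ associated to $\sigma$. If $\Sigma : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$ defined by

\[ \Sigma(\xi, \eta) := \left.\frac{d}{dt}\right|_{t=0} \langle \sigma(\exp(t\xi)), \eta \rangle \]

is a real-valued Lie algebra 2-cocycle (which is always the case if $\sigma$ is the derivative of a smooth real-valued group 2-cocycle or if $\sigma$ is the nonequivariance 1-cocycle of a momentum map), that is, $\Sigma : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$ is skew-symmetric and $\Sigma([\xi, \eta], \zeta) + \Sigma([\eta, \zeta], \xi) + \Sigma([\zeta, \xi], \eta) = 0$ for all $\xi, \eta, \zeta \in \mathfrak{g}$, then the affine orbit $\mathcal{O}_\mu$ is a symplectic manifold with $G$-invariant symplectic structure $\omega^\pm_{\mathcal{O}_\mu}$ given by

\[ \omega^\pm_{\mathcal{O}_\mu}(\nu)\big(\xi_{\mathfrak{g}^*}(\nu), \eta_{\mathfrak{g}^*}(\nu)\big) = \pm\langle \nu, [\xi, \eta] \rangle \mp \Sigma(\xi, \eta) \qquad [1] \]

for arbitrary $\nu \in \mathcal{O}_\mu$ and $\xi, \eta \in \mathfrak{g}$. The symbol $\xi_{\mathfrak{g}^*}(\nu) := -\operatorname{ad}^*_\xi \nu + \Sigma(\xi, \cdot)$ denotes the infinitesimal generator of the affine action on $\mathfrak{g}^*$ associated to $\xi \in \mathfrak{g}$. The symplectic structures $\omega^\pm_{\mathcal{O}_\mu}$ on $\mathcal{O}_\mu$ are called the $(\pm)$-orbit or Kostant-Kirillov-Souriau (KKS) symplectic forms.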

This symplectic form can be obtained from Theorem 2 by considering the symplectic reduction of the cotangent bundle $T^*G$ endowed with the magnetic symplectic structure $\omega_\Sigma := \omega_{\mathrm{can}} - \pi^* B_\Sigma$, where $\omega_{\mathrm{can}}$ is the canonical symplectic form on $T^*G$, $\pi : T^*G \to G$ is the projection onto the base, and $B_\Sigma \in \Omega^2(G)^G$ is a left-invariant 2-form on $G$ whose value at the identity is the Lie algebra 2-cocycle $\Sigma : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$. Since $\Sigma$ is a cocycle, it follows that $B_\Sigma$ is closed and hence $\omega_\Sigma$ is a symplectic form. Moreover, the lifting of the left translations on $G$ provides a canonical $G$-action on $T^*G$ that has a momentum map given by $J(g, \mu) = \Theta(g, \mu)$, $(g, \mu) \in G \times \mathfrak{g}^* \simeq T^*G$, where the trivialization $G \times \mathfrak{g}^* \simeq T^*G$ is obtained via left translations. Symplectic reduction using these ingredients yields symplectic reduced spaces that are naturally symplectically diffeomorphic to the affine orbits $\mathcal{O}_\mu$ with the symplectic form [1].


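Theorem 3 (Symplectic orbit reduction). Let $\Phi : G \times M \to M$ be a free proper canonical action of the Lie group $G$ on the connected symplectic manifold $(M, \omega)$. Suppose that this action has an associated momentum map $J : M \to \mathfrak{g}^*$, with nonequivariance 1-cocycle $\sigma : G \to \mathfrak{g}^*$. Let $\mathcal{O}_\mu := G \cdot \mu \subset \mathfrak{g}^*$ be the $G$-orbit of the point $\mu \in \mathfrak{g}^*$ with respect to the affine action of $G$ on $\mathfrak{g}^*$ associated to $\sigma$. Then the set $M_{\mathcal{O}_\mu} := J^{-1}(\mathcal{O}_\mu)/G$ is a regular quotient symplectic manifold with the symplectic form $\omega_{\mathcal{O}_\mu}$ uniquely characterized by the relation $i_{\mathcal{O}_\mu}^* \omega = \pi_{\mathcal{O}_\mu}^* \omega_{\mathcal{O}_\mu} + J_{\mathcal{O}_\mu}^* \omega^+_{\mathcal{O}_\mu}$, where $J_{\mathcal{O}_\mu}$ is the restriction of $J$ to $J^{-1}(\mathcal{O}_\mu)$ and $\omega^+_{\mathcal{O}_\mu}$ is the $(+)$-symplectic structure on the affine orbit $\mathcal{O}_\mu$. The maps $i_{\mathcal{O}_\mu} : J^{-1}(\mathcal{O}_\mu) \to M$ and $\pi_{\mathcal{O}_\mu} : J^{-1}(\mathcal{O}_\mu) \to M_{\mathcal{O}_\mu}$ are the natural injection and the projection, respectively. The pair $(M_{\mathcal{O}_\mu}, \omega_{\mathcal{O}_\mu})$ is called the symplectic orbit reduced space. Statements similar to (ii)-(iv) in Theorem 2 can be formulated for the orbit reduced spaces $(M_{\mathcal{O}_\mu}, \omega_{\mathcal{O}_\mu})$.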


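We emphasize that given a momentum value $\mu \in \mathfrak{g}^*$, the reduced spaces $M_\mu$ and $M_{\mathcal{O}_\mu}$ are symplectically diffeomorphic via the projection to the quotients of the inclusion $J^{-1}(\mu) \hookrightarrow J^{-1}(\mathcal{O}_\mu)$.

Reduction at a general point can be replaced by reduction at zero at the expense of enlarging the manifold by the affine orbit. Consider the canonical diagonal action of $G$ on the symplectic difference $M \ominus \mathcal{O}^+_\mu$, which is the manifold $M \times \mathcal{O}_\mu$ with the symplectic form $\pi_1^* \omega - \pi_2^* \omega^+_{\mathcal{O}_\mu}$, where $\pi_1 : M \times \mathcal{O}_\mu \to M$ and $\pi_2 : M \times \mathcal{O}_\mu \to \mathcal{O}_\mu$ are the projections. A momentum map for this action is given by $J \circ \pi_1 - \pi_2 : M \ominus \mathcal{O}^+_\mu \to \mathfrak{g}^*$. Let $\big((M \ominus \mathcal{O}^+_\mu)_0 := (J \circ \pi_1 - \pi_2)^{-1}(0)/G,\ (\omega \ominus \omega^+_{\mathcal{O}_\mu})_0\big)$ be the symplectic point reduced space at zero.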


Theorem 4 (Shifting theorem). Under the hypotheses of the symplectic orbit reduction theorem (Theorem 3), the symplectic orbit reduced space $M_{\mathcal{O}_\mu}$ and the point reduced spaces $M_\mu$ and $(M \ominus \mathcal{O}_\mu^+)_0$ are symplectically diffeomorphic.


Singular Reduction 


In the previous section we carried out symplectic reduction for free and proper actions. The freeness guarantees via the bifurcation lemma that the momentum map J is a submersion and hence the level sets $J^{-1}(\mu)$ are smooth manifolds. Freeness and properness ensure that the orbit spaces $M_\mu := J^{-1}(\mu)/G_\mu$ are regular quotient manifolds. The theory of singular reduction studies the properties of the orbit space $M_\mu$ when the hypothesis on the freeness of the action is dropped. The main result in this situation shows that these quotients are symplectic Whitney stratified spaces, in the sense that the strata are symplectic manifolds in a very natural way; moreover, the local properties of this Whitney stratification make it into what is called a cone space. This statement is referred to as the "symplectic stratification theorem" and adapts to the symplectic symmetric context the stratification theorem of the orbit space of a proper Lie group action by using its orbit type manifolds. In order to present this result, we review the necessary definitions and results on stratified spaces (see Singularity and Bifurcation Theory for more information on singularity theory).


Stratified Spaces 


Let $\mathcal{Z}$ be a locally finite partition of the topological space P into smooth manifolds $S_i \subset P$, $i \in I$. We assume that the manifolds $S_i \subset P$, $i \in I$, with their manifold topology are locally closed topological subspaces of P. The pair $(P, \mathcal{Z})$ is a decomposition of P with pieces in $\mathcal{Z}$ when the following condition is satisfied:


Condition (DS) If $R, S \in \mathcal{Z}$ are such that $R \cap \bar{S} \neq \emptyset$, then $R \subset \bar{S}$. In this case we write $R \preceq S$. If, in addition, $R \neq S$ we say that R is incident to S or that it is a boundary piece of S and write $R \prec S$.


The above condition is called the frontier condition and the pair $(P, \mathcal{Z})$ is called a decomposed space. The dimension of P is defined as $\dim P = \sup\{\dim S_i \mid S_i \in \mathcal{Z}\}$. If $k \in \mathbb{N}$, the k-skeleton $P^k$ of P is the union of all the pieces of dimension smaller than or equal to k; its topology is the relative topology induced by P. The depth $\mathrm{dp}(z)$ of any $z \in (P, \mathcal{Z})$ is defined as

$$\mathrm{dp}(z) := \sup\{k \in \mathbb{N} \mid \exists\, S_0, S_1, \ldots, S_k \in \mathcal{Z} \text{ with } z \in S_0 \prec S_1 \prec \cdots \prec S_k\}.$$

Since for any two elements x, y in the same piece $S \in \mathcal{Z}$ we have $\mathrm{dp}(x) = \mathrm{dp}(y)$, the depth $\mathrm{dp}(S)$ of the piece S is well defined by $\mathrm{dp}(S) := \mathrm{dp}(x)$, $x \in S$. Finally, the depth $\mathrm{dp}(P)$ of $(P, \mathcal{Z})$ is defined by $\mathrm{dp}(P) := \sup\{\mathrm{dp}(S) \mid S \in \mathcal{Z}\}$.
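The definition of depth can be illustrated on a finite decomposition. The following is only a sketch under stated assumptions: the function name `piece_depths`, the dictionary encoding of the strict incidence relation, and the square example are hypothetical and not part of the original text. The depth of a piece is computed as the length of the longest chain of strictly incident pieces above it.

```python
from functools import lru_cache


def piece_depths(incidence):
    """Return the depth of every piece of a finite decomposition.

    `incidence` maps each piece name to the set of pieces S with R < S
    strictly (R is a boundary piece of S). The relation is assumed to be
    a strict partial order, so it contains no cycles. The depth of R is
    the length k of the longest chain R = S_0 < S_1 < ... < S_k.
    """

    @lru_cache(maxsize=None)
    def depth(piece):
        above = incidence.get(piece, ())
        # A piece with nothing above it is maximal and has depth 0.
        return max((1 + depth(s) for s in above), default=0)

    return {piece: depth(piece) for piece in incidence}


if __name__ == "__main__":
    # Hypothetical example: a closed square decomposed into its corner
    # points, its open edges, and its open interior ("face").
    square = {
        "vertex": {"edge", "face"},
        "edge": {"face"},
        "face": set(),
    }
    depths = piece_depths(square)
    print(depths)                # depth of each piece
    print(max(depths.values()))  # depth of the decomposed space
```

In this encoding the depth of the whole space is simply the maximum over the pieces, which mirrors the supremum in the definition of dp(P).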

A continuous mapping $f: P \to Q$ between the decomposed spaces $(P, \mathcal{Z})$ and $(Q, \mathcal{Y})$ is a morphism of decomposed spaces if, for every piece $S \in \mathcal{Z}$, there is a piece $T \in \mathcal{Y}$ such that $f(S) \subset T$ and the restriction $f|_S: S \to T$ is smooth. If $(P, \mathcal{Z})$ and $(P, \mathcal{T})$ are two decompositions of the same topological space we say that $\mathcal{Z}$ is coarser than $\mathcal{T}$ or that $\mathcal{T}$ is finer than $\mathcal{Z}$ if the identity mapping $(P, \mathcal{T}) \to (P, \mathcal{Z})$ is a morphism of decomposed spaces. A topological subspace $Q \subset P$ is a decomposed subspace of $(P, \mathcal{Z})$ if, for all pieces $S \in \mathcal{Z}$, the intersection $S \cap Q$ is a submanifold of S and the corresponding partition $\mathcal{Z} \cap Q$ forms a decomposition of Q.

Let P be a topological space and $z \in P$. Two subsets A and B of P are said to be equivalent at z if there is an open neighborhood U of z such that $A \cap U = B \cap U$. This relation constitutes an equivalence relation on the power set of P. The class of all sets equivalent to a given subset A at z will be denoted by $[A]_z$ and called the set germ of A at z. If $A \subset B \subset P$, we say that $[A]_z$ is a subgerm of $[B]_z$, and denote $[A]_z \subset [B]_z$.

A stratification of the topological space P is a map 
S that associates to any z € P the set germ S(z) of a 
closed subset of P such that the following condition 
is satisfied: 


Condition (ST) For every z € P there is a neighbor- 
hood U of z and a decomposition Z of U such that 
for all y € U the germ S(y) coincides with the set 
germ of the piece of Z that contains y. 


The pair (P,S) is called a stratified space. Any 
decomposition of P defines a stratification of P by 
associating to each of its points the set germ of the 
piece in which it is contained. The converse is, by 
definition, locally true. 


The Strata 


Two decompositions $\mathcal{Z}_1$ and $\mathcal{Z}_2$ of P are said to be equivalent if they induce the same stratification of P. If $\mathcal{Z}_1$ and $\mathcal{Z}_2$ are equivalent decompositions of P then, for all $z \in P$, we have that $\mathrm{dp}_{\mathcal{Z}_1}(z) = \mathrm{dp}_{\mathcal{Z}_2}(z)$. Any stratified space $(P, \mathcal{S})$ has a unique decomposition $\mathcal{Z}_{\mathcal{S}}$ associated with the following maximality property: for any open subset $U \subset P$ and any decomposition $\mathcal{Z}$ of P inducing $\mathcal{S}$ over U, the restriction of $\mathcal{Z}_{\mathcal{S}}$ to U is coarser than the restriction of $\mathcal{Z}$ to U. The decomposition $\mathcal{Z}_{\mathcal{S}}$ is called the canonical decomposition associated to the stratification $(P, \mathcal{S})$. It is often denoted by $\mathcal{S}$ and its pieces are called the strata of P. The local finiteness of the decomposition $\mathcal{Z}_{\mathcal{S}}$ implies that for any stratum S of $(P, \mathcal{S})$ there are only finitely many strata R with $S \prec R$. Henceforth, the symbol $\mathcal{S}$ in the stratification $(P, \mathcal{S})$ will denote both the map that associates to each point a set germ and the set of pieces associated to the canonical decomposition induced by the stratification of P.


Stratified Spaces with Smooth Structure 


Let $(P, \mathcal{S})$ be a stratified space. A singular or stratified chart of P is a homeomorphism $\phi: U \to \phi(U) \subset \mathbb{R}^n$ from an open set $U \subset P$ to a subset of $\mathbb{R}^n$ such that for every stratum $S \in \mathcal{S}$ the image $\phi(U \cap S)$ is a submanifold of $\mathbb{R}^n$ and the restriction $\phi|_{U \cap S}: U \cap S \to \phi(U \cap S)$ is a diffeomorphism. Two singular charts $\phi: U \to \phi(U) \subset \mathbb{R}^n$ and $\varphi: V \to \varphi(V) \subset \mathbb{R}^m$ are compatible if for any $z \in U \cap V$ there exist an open neighborhood $W \subset U \cap V$ of z, a natural number $N \geq \max\{n, m\}$, open neighborhoods $O, O' \subset \mathbb{R}^N$ of $\phi(U) \times \{0\}$ and $\varphi(V) \times \{0\}$, respectively, and a diffeomorphism $\psi: O \to O'$ such that $i_m \circ \varphi|_W = \psi \circ i_n \circ \phi|_W$, where $i_n$ and $i_m$ denote the natural embeddings of $\mathbb{R}^n$ and $\mathbb{R}^m$ into $\mathbb{R}^N$ by using the first n and m coordinates, respectively. The notion of singular or stratified atlas is the natural generalization for stratifications of the concept of atlas existing for smooth manifolds. Analogously, we can talk of compatible and maximal stratified atlases. If the stratified space $(P, \mathcal{S})$ has a well-defined maximal atlas, then we say that this atlas determines a smooth or differentiable structure on P. We will refer to $(P, \mathcal{S})$ as a smooth stratified space.


The Whitney Conditions 


Let M be a manifold and $R, S \subset M$ two submanifolds. We say that the pair (R, S) satisfies the Whitney condition (A) at the point $z \in R$ if the following condition is satisfied:


Condition (A) For any sequence of points $\{z_n\}_{n \in \mathbb{N}}$ in S converging to $z \in R$ for which the sequence of tangent spaces $\{T_{z_n} S\}_{n \in \mathbb{N}}$ converges in the Grassmann bundle of (dim S)-dimensional subspaces of TM to $\tau \subset T_z M$, we have that $T_z R \subset \tau$.


Let $\phi: U \to \mathbb{R}^n$ be a smooth chart of M around the point z. The Whitney condition (B) at the point $z \in R$ with respect to the chart $(U, \phi)$ is given by the following statement:


Condition (B) Let $\{x_n\}_{n \in \mathbb{N}} \subset R \cap U$ and $\{y_n\}_{n \in \mathbb{N}} \subset S \cap U$ be two sequences with the same limit

$$z = \lim_{n \to \infty} x_n = \lim_{n \to \infty} y_n$$

and such that $x_n \neq y_n$ for all $n \in \mathbb{N}$. Suppose that the set of connecting lines $\overline{\phi(x_n)\phi(y_n)} \subset \mathbb{R}^n$ converges in projective space to a line L and that the sequence of tangent spaces $\{T_{y_n} S\}_{n \in \mathbb{N}}$ converges in the Grassmann bundle of (dim S)-dimensional subspaces of TM to $\tau \subset T_z M$. Then $(T_z \phi)^{-1}(L) \subset \tau$.


If the condition (A) (respectively (B)) is verified for every point $z \in R$, the pair (R, S) is said to satisfy the Whitney condition (A) (respectively (B)). It can be verified that Whitney's condition (B) does not depend on the chart used to formulate it. A stratified space with smooth structure such that, for every pair of strata, Whitney's condition (B) is satisfied is called a Whitney space.


Cone Spaces and Local Triviality 


Let P be a topological space. Consider the equivalence relation $\sim$ in the product $P \times [0, \infty)$ given by $(z, a) \sim (z', a')$ if and only if $a = a' = 0$. We define the cone CP on P as the quotient topological space $P \times [0, \infty)/\!\sim$. If P is a smooth manifold then the cone CP is a decomposed space with two pieces, namely, $P \times (0, \infty)$ and the vertex, which is the class corresponding to any element of the form (z, 0), $z \in P$, that is, $P \times \{0\}$. Analogously, if $(P, \mathcal{Z})$ is a decomposed (stratified) space then the associated cone CP is also a decomposed (stratified) space whose pieces (strata) are the vertex and the sets of the form $S \times (0, \infty)$, with $S \in \mathcal{Z}$. This implies, in particular, that $\dim CP = \dim P + 1$ and $\mathrm{dp}(CP) = \mathrm{dp}(P) + 1$.
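As a small worked illustration of the two counting formulas (the choice $P = S^1$ is ours and is not taken from the original text), consider the cone over a circle, regarded as a decomposed space with a single piece:

```latex
\begin{aligned}
P &= S^1, \qquad \dim P = 1, \qquad \mathrm{dp}(P) = 0,\\
CP &= \bigl(S^1 \times [0,\infty)\bigr)/\!\sim \;\cong\; \mathbb{R}^2,\\
\text{pieces of } CP &: \quad \{\text{vertex}\} \;\prec\; S^1 \times (0,\infty),\\
\dim CP &= 2 = \dim P + 1, \qquad \mathrm{dp}(CP) = 1 = \mathrm{dp}(P) + 1 .
\end{aligned}
```

The vertex lies in the closure of the open piece $S^1 \times (0,\infty)$, so the longest chain of incident pieces has length one, in agreement with the formula for the depth.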

A stratified space $(P, \mathcal{S})$ is said to be locally trivial if for any $z \in P$ there exist a neighborhood U of z, a stratified space $(F, \mathcal{S}^F)$, a distinguished point $0 \in F$, and an isomorphism of stratified spaces

$$\psi: U \to (S \cap U) \times F,$$

where S is the stratum that contains z and $\psi$ satisfies $\psi^{-1}(y, 0) = y$ for all $y \in S \cap U$. When F is given by a cone CL over a compact stratified space L then L is called the link of z.

An important corollary of "Thom's first isotopy lemma" guarantees that every Whitney stratified space is locally trivial. A converse to this implication needs the introduction of cone spaces. Their definition is given by recursion on the depth of the space.


Definition 5 Let $m \in \mathbb{N} \cup \{\infty, \omega\}$. A cone space of class $C^m$ and depth 0 is the union of countably many $C^m$ manifolds together with the stratification whose strata are the unions of the connected components of equal dimension. A cone space of class $C^m$ and depth $d + 1$, $d \in \mathbb{N}$, is a stratified space $(P, \mathcal{S})$ with a $C^m$ differentiable structure such that for any $z \in P$ there exists a connected neighborhood U of z, a compact cone space L of class $C^m$ and depth d called the link, and a stratified isomorphism

$$\psi: U \to (S \cap U) \times CL,$$

where S is the stratum that contains the point z, the map $\psi$ satisfies $\psi^{-1}(y, 0) = y$ for all $y \in S \cap U$, and 0 is the vertex of the cone CL.

If $m \neq 0$ then L is required to be embedded into a sphere via a fixed smooth global singular chart $\varphi: L \to S^l$ that determines the smooth structure of CL. More specifically, the smooth structure of CL is generated by the global chart $\tau: [z, t] \in CL \mapsto t\varphi(z) \in \mathbb{R}^{l+1}$. The maps $\psi: U \to (S \cap U) \times CL$ and $\varphi: L \to S^l$ are referred to as a cone chart and a link chart, respectively. Moreover, if $m \neq 0$ then $\psi$ and $\psi^{-1}$ are required to be differentiable of class $C^m$ as maps between stratified spaces with a smooth structure.


The Symplectic Stratification Theorem 


Let $(M, \omega)$ be a connected symplectic manifold acted canonically and properly upon by a Lie group G. Suppose that this action has an associated momentum map $J: M \to \mathfrak{g}^*$ with nonequivariance 1-cocycle $\sigma: G \to \mathfrak{g}^*$. Let $\mu \in \mathfrak{g}^*$ be a value of J, $G_\mu$ the isotropy subgroup of $\mu$ with respect to the affine action $\Theta: G \times \mathfrak{g}^* \to \mathfrak{g}^*$ determined by $\sigma$, and let $H \subset G$ be an isotropy subgroup of the G-action on M. Let $M_H^z$ be the connected component of the H-isotropy type manifold that contains a given element $z \in M$ such that $J(z) = \mu$ and let $G_\mu M_H^z$ be its $G_\mu$-saturation. Then the following hold:


1. The set $J^{-1}(\mu) \cap G_\mu M_H^z$ is a submanifold of M.

2. The set $M_\mu^{(H)} := [J^{-1}(\mu) \cap G_\mu M_H^z]/G_\mu$ has a unique quotient differentiable structure such that the canonical projection $\pi_\mu^{(H)}: J^{-1}(\mu) \cap G_\mu M_H^z \to M_\mu^{(H)}$ is a surjective submersion.

3. There is a unique symplectic structure $\omega_\mu^{(H)}$ on $M_\mu^{(H)}$ characterized by

$$i_\mu^{(H)*} \omega = \pi_\mu^{(H)*} \omega_\mu^{(H)},$$

where $i_\mu^{(H)}: J^{-1}(\mu) \cap G_\mu M_H^z \hookrightarrow M$ is the natural inclusion. The pairs $(M_\mu^{(H)}, \omega_\mu^{(H)})$ will be called singular symplectic point strata.

4. Let $h \in C^\infty(M)^G$ be a G-invariant Hamiltonian. Then the flow $F_t$ of $X_h$ leaves the connected components of $J^{-1}(\mu) \cap G_\mu M_H^z$ invariant and commutes with the $G_\mu$-action, so it induces a flow $F_t^\mu$ on $M_\mu^{(H)}$ that is characterized by $\pi_\mu^{(H)} \circ F_t \circ i_\mu^{(H)} = F_t^\mu \circ \pi_\mu^{(H)}$.

5. The flow $F_t^\mu$ is Hamiltonian on $M_\mu^{(H)}$, with reduced Hamiltonian function $h_\mu^{(H)}: M_\mu^{(H)} \to \mathbb{R}$ defined by $h_\mu^{(H)} \circ \pi_\mu^{(H)} = h \circ i_\mu^{(H)}$. The vector fields $X_h$ and $X_{h_\mu^{(H)}}$ are $\pi_\mu^{(H)}$-related.

6. Let $k: M \to \mathbb{R}$ be another G-invariant function. Then $\{h, k\}$ is also G-invariant and $\{h, k\}_\mu^{(H)} = \{h_\mu^{(H)}, k_\mu^{(H)}\}_{M_\mu^{(H)}}$, where $\{\cdot, \cdot\}_{M_\mu^{(H)}}$ denotes the Poisson bracket induced by the symplectic structure on $M_\mu^{(H)}$.

Theorem 6 (Symplectic stratification theorem). The quotient $M_\mu := J^{-1}(\mu)/G_\mu$ is a cone space when considered as a stratified space with strata $M_\mu^{(H)}$.


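As was the case for regular reduction, this theorem can also be formulated from the orbit reduction point of view. Using that approach one can conclude that the orbit reduced spaces $M_{\mathcal{O}_\mu}$ are cone spaces symplectically stratified by the manifolds $M_{\mathcal{O}_\mu}^{(H)} := G \cdot (J^{-1}(\mu) \cap M_H^z)/G$ that have symplectic structure uniquely determined by the expression

$$i_{\mathcal{O}_\mu}^{(H)*} \omega = \pi_{\mathcal{O}_\mu}^{(H)*} \omega_{\mathcal{O}_\mu}^{(H)} + J_{\mathcal{O}_\mu}^{(H)*} \omega^+_{\mathcal{O}_\mu},$$

where $i_{\mathcal{O}_\mu}^{(H)}: G \cdot (J^{-1}(\mu) \cap M_H^z) \hookrightarrow M$ is the inclusion, $J_{\mathcal{O}_\mu}^{(H)}: G \cdot (J^{-1}(\mu) \cap M_H^z) \to \mathcal{O}_\mu$ is obtained by restriction of the momentum map J, and $\omega^+_{\mathcal{O}_\mu}$ is the (+)-symplectic form on $\mathcal{O}_\mu$. Analogous statements to (1)-(6) above with obvious modifications are valid.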


See also: Cotangent Bundle Reduction; Dynamical 
Systems in Mathematical Physics: An Illustration 

from Water Waves; Graded Poisson Algebras; 
Hamiltonian Group Actions; Lie Groups: General Theory; 
Poisson Reduction; Singularity and Bifurcation Theory; 
Symmetries and Conservation Laws. 
Introduction


Spontaneous symmetry breaking in its simplest form 
occurs when there is a symmetry of a dynamical 
system that is not manifest in its ground state or 
equilibrium state. It is a common feature of many 
classical and quantum systems. In quantum field 
theories, in the infinite-volume limit, there are new 
features, the appearance of unitarily inequivalent 
representations of the canonical commutation 
relations, and the possibility of a true phase 
transition — a point in the phase space where the 
thermodynamic free energy is nonanalytic. The 
spontaneous breaking of a continuous global sym- 
metry implies the existence of massless particles, the 
Goldstone bosons, while in the local-symmetry case 
some or all of these may be eliminated by the Higgs 
mechanism. Spontaneous symmetry breaking in 
gauge theories is however a more elusive concept. 


Breaking of Global Symmetries 


In a quantum-mechanical system a (time-independent) symmetry is represented by a unitary operator U acting on the Hilbert space of quantum states which commutes with the Hamiltonian H. If the ground state $|0\rangle$ of the system is not invariant under U, then $|0'\rangle = U|0\rangle \neq c|0\rangle$ is also a ground state. In other words, the ground state is degenerate.

For a system with a finite number of degrees of freedom, whose states are represented by vectors in a separable Hilbert space $\mathcal{H}$, symmetry breaking of an abelian symmetry group G is impossible, unless there are additional accidental symmetries. Consider, for example, a particle in a double-well potential

$$V = \frac{m\omega^2}{8a^2}\,(x^2 - a^2)^2 \qquad [1]$$

which has the discrete symmetry group $G = \mathbb{Z}_2$; the inversion symmetry operator U satisfies $U^2 = 1$. There are then two approximate ground states $|0\rangle$ and $|0'\rangle = U|0\rangle$, with wave functions proportional to $\exp[-\tfrac{1}{2} m\omega (x \mp a)^2]$. However, there is an overlap between these, and the off-diagonal matrix element $\langle 0|H|0'\rangle$ is nonzero, although exponentially small, so the true energy eigenstates are, approximately, $|0_\pm\rangle = (1/\sqrt{2})(|0\rangle \pm |0'\rangle)$. (More accurate energy eigenfunctions and eigenvalues may be found by using the WKB approximation.)
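The lifting of the degeneracy can be made explicit in the two-dimensional subspace spanned by the approximate ground states. The sketch below is illustrative only: the function names, the numerical values of `e0` and `delta`, and the sign convention $\langle 0|H|0'\rangle = -\delta$ with $\delta > 0$ are assumptions, not quantities given in the text. It checks that the symmetric and antisymmetric combinations are eigenvectors of the 2x2 effective Hamiltonian, split by $2\delta$.

```python
import math


def two_level_spectrum(e0, delta):
    """Spectrum of H = [[e0, -delta], [-delta, e0]] in the basis (|0>, |0'>).

    e0    : common energy of the two approximate ground states (assumed value)
    delta : magnitude of the tunnelling matrix element, delta > 0 (assumed value)
    Returns (energy, (c0, c0prime)) pairs, lowest energy first.
    """
    s = 1.0 / math.sqrt(2.0)
    return [(e0 - delta, (s, s)), (e0 + delta, (s, -s))]


def apply_h(e0, delta, vec):
    """Apply the 2x2 effective Hamiltonian to a coefficient vector."""
    c0, c1 = vec
    return (e0 * c0 - delta * c1, -delta * c0 + e0 * c1)


if __name__ == "__main__":
    e0, delta = 0.5, 1e-3  # illustrative numbers only
    for energy, vec in two_level_spectrum(e0, delta):
        hv = apply_h(e0, delta, vec)
        # For an eigenvector, H v equals energy * v component by component.
        ok = all(math.isclose(a, energy * b) for a, b in zip(hv, vec))
        print(energy, vec, ok)
```

With the opposite sign of the matrix element the roles of the symmetric and antisymmetric combinations are exchanged, but the eigenstates are the same pair $|0_\pm\rangle$.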

Of course, if the symmetry group is nonabelian, and the ground state belongs to a nontrivial representation, then degeneracy is unavoidable. For example, if G is the rotation group SO(3) (or SU(2)) and the ground state has angular momentum $j \neq 0$, then it is $(2j + 1)$-fold degenerate.

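The situation is different, however, in a quantum field theory. In the infinite-volume limit, even abelian symmetries can be spontaneously broken. Take, for example, a real scalar field with Lagrangian

$$\mathcal{L} = \tfrac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - V = \tfrac{1}{2}\dot\phi^2 - \tfrac{1}{2}(\nabla\phi)^2 - V \qquad [2]$$

(where we set $c = \hbar = 1$), again with a double-well potential

$$V = \tfrac{1}{8}\lambda(\phi^2 - \eta^2)^2 \qquad [3]$$

which exhibits a $\mathbb{Z}_2$ symmetry under $\phi(x) \mapsto -\phi(x)$.

At least in the semiclassical or tree approximation, there are two degenerate vacuum states $|0\rangle$ and $|0'\rangle$, with

$$\langle 0|\phi(x)|0\rangle = \eta \quad \text{and} \quad \langle 0'|\phi(x)|0'\rangle = -\eta \qquad [4]$$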


If we quantize the system in a box of finite volume $\mathcal{V}$, then, as earlier, there is an off-diagonal matrix element of the Hamiltonian connecting the two states, so the true ground state is (approximately) $(1/\sqrt{2})(|0\rangle + |0'\rangle)$. However, this matrix element goes to zero exponentially as $\mathcal{V} \to \infty$. Even for large but finite volume, the rate of transitions from $|0\rangle$ to $|0'\rangle$ is exponentially slow.

Similarly, we can consider a complex scalar field theory with a sombrero potential:

$$\mathcal{L} = |\dot\phi|^2 - |\nabla\phi|^2 - V, \qquad V = \tfrac{1}{2}\lambda\bigl(|\phi|^2 - \tfrac{1}{2}\eta^2\bigr)^2 \qquad [5]$$

This model is invariant under the U(1) group of phase transformations, $\phi(x) \mapsto \phi(x)e^{i\alpha}$, so we now have a continuously infinite set of degenerate vacuum states $|0_\alpha\rangle$ labeled by an angle $\alpha$, and satisfying

$$\langle 0_\alpha|\phi(x)|0_\alpha\rangle = \tfrac{1}{\sqrt{2}}\,\eta\, e^{i\alpha} \qquad [6]$$

Once again, one finds that in the infinite-volume limit there are no matrix elements connecting the different vacuum states. Moreover, in this limit no polynomial formed from the field operators $\phi(x)$ in a finite volume can have nonzero matrix elements between $|0_\alpha\rangle$ and $|0_\beta\rangle$ for $\alpha \neq \beta$. Applying the operators $\phi(x)$ to any one of these vacuum states $|0_\alpha\rangle$, we can construct a Fock space $\mathcal{H}_\alpha$, and the representations of the canonical commutation relations on these separate Hilbert spaces are unitarily inequivalent. Formally, we can introduce operators $U_\alpha$ that perform the symmetry transformations:

$$U_\alpha \phi(x) U_\alpha^{-1} = \phi(x)e^{i\alpha} \qquad [7]$$
However, these are not unitary operators on the spaces $\mathcal{H}_\alpha$ but rather maps from one space to another, $U_\alpha: \mathcal{H}_\beta \to \mathcal{H}_{\alpha+\beta}$, or, alternatively, operators on the nonseparable Hilbert space $\mathcal{H} = \bigoplus_\alpha \mathcal{H}_\alpha$.

So far, our discussion has been restricted to the tree approximation. For a full quantum treatment, $V(\phi)$ must be replaced by the effective potential $V_{\rm eff}(\phi)$, which may be defined as the minimum value of the mean energy density in all states in which the field $\phi$ has the uniform expectation value $\langle\phi(x)\rangle = \phi$. $V_{\rm eff}$ may be computed by summing vacuum loop diagrams.

A point to note is that although the degenerate vacua $|0_\alpha\rangle$ are mathematically distinct, in the absence of any external definition of phase, they are physically identical. There is no internal observational test that will distinguish them.


Symmetry-Breaking Phase Transitions 


Spontaneous symmetry breaking often occurs in the context of a phase transition. At high temperature, $T \gg \eta$, there are large fluctuations in $\phi$ and the central hump of the potential is unimportant. Then the equilibrium state is symmetric, with $\langle\phi\rangle = 0$. However, as the temperature falls, it becomes less probable that the field will fluctuate over the top of the hump. It will tend to fall into the trough and acquire a nonzero average value $\langle\phi\rangle$, the order parameter for the phase transition, thus breaking the symmetry. The direction of symmetry breaking (e.g., the phase of $\phi$ in the U(1) model) is random, determined in practice by small preexisting fluctuations or interactions with the environment.

One way of studying this process is to compute the temperature-dependent effective potential $V_{\rm eff}(\phi, T)$. In the one-loop approximation, at high temperature, the leading corrections to the zero-temperature effective potential $V_{\rm eff}(\phi, 0)$ are of the form

$$V_{\rm eff}(\phi, T) = V_{\rm eff}(\phi, 0) - \frac{\pi^2}{90} N_* T^4 + \frac{1}{24} M_*^2(\phi)\, T^2 + O(T) \qquad [8]$$

where $N_*$ is the total number of helicity states of light particles (those with masses $\ll T$), and $M_*^2$, which depends on $\phi$, is the sum of their squared masses. (Fermions if present contribute to $N_*$ with a factor of 7/8 and to $M_*^2$ with a factor of 1/2.) In the simplest case, where we have only a multiplet $\phi = (\phi_a)_{a=1,\ldots,N}$ of real scalar fields, $N_* = N$ and $M_*^2 = M^2_{aa}$ (summation over a implied), where the mass-squared matrix is

$$M^2_{ab} = \frac{\partial^2 V}{\partial\phi_a\,\partial\phi_b} \qquad [9]$$
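For example, in an O(N) theory, with $V = \tfrac{1}{8}\lambda(\phi^2 - \eta^2)^2$, where $\phi^2 = \phi_a\phi_a$, one has

$$M^2_{ab} = \tfrac{1}{2}\lambda(\phi^2 - \eta^2)\delta_{ab} + \lambda\phi_a\phi_b \qquad [10]$$

whence

$$V_{\rm eff}(\phi, T) = \tfrac{1}{8}\lambda(\phi^2 - \eta^2)^2 - \frac{\pi^2}{90} N T^4 + \tfrac{1}{48}\lambda T^2\bigl[(N + 2)\phi^2 - N\eta^2\bigr] \qquad [11]$$

It is then easy to see that the minimum occurs at $\phi = 0$ for $T > T_c$, where in this approximation $T_c^2 = 12\eta^2/(N + 2)$, while below the critical temperature the minimum is at

$$\phi^2 = \phi_{\rm eq}^2(T) \equiv \eta^2 - \frac{N + 2}{12}\, T^2 \qquad [12]$$

As $T \to 0$, the equilibrium state approaches one of the vacuum states $|0_{\mathbf n}\rangle$, labeled by an N-dimensional unit vector $\mathbf n$, such that $\langle 0_{\mathbf n}|\phi|0_{\mathbf n}\rangle = \eta\mathbf n$.

It is often convenient to introduce a classical symmetry-breaking potential. For example, in the O(N) model, we may take $V_{\rm sb} = -\mathbf j \cdot \phi(x)$, where $\mathbf j$ is a constant N-vector. This has the effect of tilting the potential, thus removing the degeneracy. A characteristic of spontaneous symmetry breaking is that the limits $j \to 0$ and $\mathcal{V} \to \infty$ do not commute. If (for $T < T_c$) we take the infinite-volume limit first, and then let $\mathbf j \to 0$, we get different equilibrium states, depending on the direction from which $\mathbf j$ approaches zero; if we fix $\mathbf n$ and let $\mathbf j = j\mathbf n$, $j \to 0$, then we find

$$\lim_{j \to 0}\,\lim_{\mathcal{V} \to \infty} \langle\phi(x)\rangle_{j\mathbf n} = \phi_{\rm eq}(T)\,\mathbf n \qquad [13]$$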
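The location of the minimum of the high-temperature effective potential [11] for the O(N) model can be cross-checked numerically. The sketch below is not part of the original text: the function names and the parameter values are hypothetical, and the field-independent $T^4$ term is dropped because it does not move the minimum. It compares the closed-form equilibrium value with a brute-force grid minimization.

```python
import math


def critical_temperature(eta, n):
    """T_c from T_c^2 = 12 eta^2 / (N + 2) (one-loop, high-temperature form)."""
    return math.sqrt(12.0 * eta**2 / (n + 2))


def v_eff_phi_part(phi, temp, lam, eta, n):
    """phi-dependent part of V_eff(phi, T) for the O(N) model.

    The -(pi^2/90) N T^4 term is omitted: it is independent of phi.
    """
    return (lam / 8.0) * (phi**2 - eta**2) ** 2 + (lam / 48.0) * temp**2 * (
        (n + 2) * phi**2 - n * eta**2
    )


def phi_eq(temp, eta, n):
    """Closed-form equilibrium |phi|: zero above T_c, nonzero below."""
    phi_sq = eta**2 - (n + 2) * temp**2 / 12.0
    return math.sqrt(phi_sq) if phi_sq > 0.0 else 0.0


def phi_min_numeric(temp, lam, eta, n, steps=20000):
    """Brute-force minimum of the phi-dependent part on a grid in [0, 2 eta]."""
    grid = [2.0 * eta * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda p: v_eff_phi_part(p, temp, lam, eta, n))


if __name__ == "__main__":
    lam, eta, n = 0.1, 1.0, 4  # illustrative values only
    print("T_c =", critical_temperature(eta, n))
    for temp in (0.5, 1.0, 2.0):
        print(temp, phi_eq(temp, eta, n), phi_min_numeric(temp, lam, eta, n))
```

The grid search is deliberately naive; it is only meant to confirm that the analytic minimum and the shape of the potential agree on both sides of $T_c$.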


We may also regard $\mathbf j$ as representing an interaction with the external environment (e.g., other fields). If such a term is present during the cooling of the system through the phase transition, it will constrain the direction of the spontaneous symmetry breaking. Note that one always arrives in this way at one of the degenerate vacua $|0_{\mathbf n}\rangle$, not a linear combination of them.


Goldstone Bosons 


The Goldstone theorem states that spontaneous breaking of any continuous global symmetry leads inevitably (except, as we discuss later, in the presence of long-range forces) to the appearance of massless modes, the Goldstone bosons.

The proof is straightforward. Associated with any continuous symmetry there is a Noether current satisfying the continuity equation $\partial_\mu j^\mu = 0$ and such that infinitesimal symmetry transformations are generated by the spatial integral of $j^0$. The fact that the symmetry is broken means that there is some scalar field $\phi(x)$ whose vacuum expectation value $\langle 0|\phi(0)|0\rangle$ is not invariant under the symmetry transformation. Hence,

$$\lim_{\mathcal{V} \to \infty} i\int_{\mathcal{V}} d^3x\, \langle 0|[j^0(x), \phi(0)]|0\rangle\Big|_{x^0 = 0} \neq 0 \qquad [14]$$

Moreover, the time derivative of this integral is

$$\lim_{\mathcal{V} \to \infty} i\int_{\mathcal{V}} d^3x\, \langle 0|[\partial_0 j^0(x), \phi(0)]|0\rangle\Big|_{x^0 = 0} = -\lim_{\mathcal{V} \to \infty} i\int_{\partial\mathcal{V}} dS_k\, \langle 0|[j^k(x), \phi(0)]|0\rangle\Big|_{x^0 = 0} = 0 \qquad [15]$$

where $\partial\mathcal{V}$ is the bounding surface of $\mathcal{V}$. This vanishes because the surface integral is zero: in a relativistic theory, because the commutator vanishes at spacelike separation, and more generally in the absence of long-range interactions because it tends rapidly to zero at large spatial separation.

Now, inserting a complete set of momentum eigenstates $|n, \mathbf p\rangle$ in [14], we can see that there must exist states such that $\langle n, \mathbf p|\phi(0)|0\rangle \neq 0$, with $p^0 \to 0$ in the limit $|\mathbf p| \to 0$, that is, massless modes.

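One can see this more directly in the U(1) model above. Consider a vacuum state $|0\rangle$ such that $\langle 0|\phi|0\rangle = \eta/\sqrt{2}$ is real. Then it is useful to shift the origin of $\phi$ by writing

$$\phi(x) = \tfrac{1}{\sqrt{2}}\,[\eta + \varphi_1(x) + i\varphi_2(x)] \qquad [16]$$

where $\varphi_1$ and $\varphi_2$ are real. Then the Lagrangian becomes

$$\mathcal{L} = \tfrac{1}{2}\bigl[\dot\varphi_1^2 - (\nabla\varphi_1)^2 + \dot\varphi_2^2 - (\nabla\varphi_2)^2\bigr] - \tfrac{1}{2}\lambda\eta^2\varphi_1^2 - \tfrac{1}{2}\lambda\eta\,\varphi_1(\varphi_1^2 + \varphi_2^2) - \tfrac{1}{8}\lambda(\varphi_1^2 + \varphi_2^2)^2 \qquad [17]$$

Evidently, the field $\varphi_1$, corresponding to radial oscillations in $\phi$, is massive, with mass $\sqrt{\lambda}\,\eta$. But there is no term in $\varphi_2^2$, so $\varphi_2$ is massless.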
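The potential terms of [17] can be checked against the sombrero potential [5] by direct substitution of the shifted field [16]. The following sketch performs that comparison numerically at random field values; the function names, the sample parameters, and the random-sampling strategy are illustrative assumptions, not part of the source.

```python
import random


def v_sombrero(phi, lam, eta):
    """V = (lam/2) (|phi|^2 - eta^2/2)^2 for a complex field value phi."""
    return 0.5 * lam * (abs(phi) ** 2 - 0.5 * eta**2) ** 2


def v_shifted(p1, p2, lam, eta):
    """Potential terms of the shifted Lagrangian in the real fields p1, p2.

    p1 and p2 stand for varphi_1 and varphi_2 in the decomposition
    phi = (eta + varphi_1 + i varphi_2) / sqrt(2).
    """
    r2 = p1**2 + p2**2
    return (
        0.5 * lam * eta**2 * p1**2   # mass term for varphi_1 only
        + 0.5 * lam * eta * p1 * r2  # cubic interactions
        + 0.125 * lam * r2**2        # quartic interactions
    )


def max_mismatch(lam=0.7, eta=1.3, trials=1000, seed=0):
    """Largest absolute difference between the two forms over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        p1, p2 = rng.uniform(-2, 2), rng.uniform(-2, 2)
        phi = complex(eta + p1, p2) / 2**0.5
        diff = abs(v_sombrero(phi, lam, eta) - v_shifted(p1, p2, lam, eta))
        worst = max(worst, diff)
    return worst


if __name__ == "__main__":
    print(max_mismatch())  # should be at the level of floating-point rounding
```

The absence of a $\varphi_2^2$ term in `v_shifted` is exactly the statement that the angular mode is massless at tree level.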

In the case of spontaneous symmetry breaking of nonabelian symmetries, there may be several Goldstone bosons, one for each broken component of the continuous symmetry. In our theory with symmetry group G = O(N), the possible values of the vacuum expectation value at T = 0 are $\langle 0_{\mathbf n}|\phi(0)|0_{\mathbf n}\rangle = \eta\mathbf n$, where $\mathbf n$ is an arbitrary unit vector. In this case, for given $\mathbf n$, there is an unbroken symmetry subgroup

$$H = \{R \in O(N) : R\mathbf n = \mathbf n\} = O(N - 1) \qquad [18]$$

and the number of broken symmetries is

$$\dim G - \dim H = N - 1 \qquad [19]$$

Thus, the radial component of $\phi$ is massive, and there are $N - 1$ Goldstone bosons, the $N - 1$ transverse components.
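The counting in [19] follows from the standard dimension formula $\dim O(n) = n(n-1)/2$. A minimal sketch of that arithmetic (function names are ours, chosen for illustration):

```python
def dim_orthogonal(n):
    """Dimension of the Lie group O(n), namely n (n - 1) / 2."""
    return n * (n - 1) // 2


def goldstone_count(n):
    """Number of broken generators for O(N) -> O(N - 1): dim G - dim H."""
    return dim_orthogonal(n) - dim_orthogonal(n - 1)


if __name__ == "__main__":
    for n in (2, 3, 4, 10):
        print(n, dim_orthogonal(n), goldstone_count(n))
```

For every N the difference reduces to N - 1, matching the number of transverse field components.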


Spontaneously Broken Gauge Theories 


As we shall see, symmetry breaking in gauge 
theories is a more problematic concept but, for the 
moment, these complications are ignored and the 
present discussion will continue with an approach 
similar to that used above. 

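The simplest local gauge symmetry theory is a U(1) Higgs model, a model of a complex scalar field $\phi(x)$ interacting with a gauge potential $A_\mu(x)$, described by the Lagrangian

$$\mathcal{L} = D_\mu\phi^* D^\mu\phi - \tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} - V(|\phi|) \qquad [20]$$

where V is a sombrero potential as in [5], while the covariant derivative $D_\mu\phi$ and gauge field $F_{\mu\nu}$ are given by

$$D_\mu\phi = \partial_\mu\phi + ieA_\mu\phi, \qquad F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \qquad [21]$$

The model is invariant under the local U(1) gauge transformations

$$\phi(x) \mapsto \phi(x)e^{i\alpha(x)}, \qquad A_\mu(x) \mapsto A_\mu(x) - \frac{1}{e}\partial_\mu\alpha(x) \qquad [22]$$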

The Goldstone theorem does not apply to local-symmetry theories. The problem is that to have a Hilbert space containing only physical states one must eliminate the gauge freedom by choosing a gauge condition (e.g., in the U(1) case the Coulomb gauge $\partial_k A^k(x) = 0$, which has the effect of restricting the number of polarization states of photons to two). This necessarily breaks manifest Lorentz invariance, although the theory is, of course, still fully Lorentz invariant. The proof of the theorem fails because the current is no longer local; the long-range Coulomb interaction makes the commutator fall off only like $1/r^2$, so the surface integral no longer vanishes in the infinite-volume limit. (The theorem also fails for nonrelativistic models with long-range forces.)


Symmetry Breaking in Field Theory 201 


Again, consider a vacuum state $|0\rangle$ in which $\langle 0|\phi|0\rangle = \eta/\sqrt{2}$, and make the same decomposition, [16]. Then, if we set

$$A'_\mu = A_\mu + \frac{1}{e\eta}\,\partial_\mu\varphi_2 \qquad [23]$$

we find that the kinetic term for $\varphi_2$ has been absorbed into a mass term $\tfrac{1}{2}e^2\eta^2 A'_\mu A'^\mu$ for the vector field. We have a model with only massive fields: the "Higgs field" $\varphi_1$ with mass $\sqrt{\lambda}\,\eta$ and the gauge field $A'_\mu$ with mass $e\eta$. The Goldstone bosons have been "eaten up" by the vector field to provide its longitudinal mode. This is the Higgs mechanism, first noted by Anderson in the context of the photon in a plasma becoming a massive plasmon.
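The absorption of the $\varphi_2$ kinetic term can be followed explicitly at quadratic order in the fields. The short derivation below is a sketch that uses only the covariant derivative [21], the decomposition [16], and the redefinition [23]; cubic and quartic interaction terms are dropped.

```latex
\begin{aligned}
D_\mu\phi &= \tfrac{1}{\sqrt{2}}\bigl[\partial_\mu\varphi_1 + i\,\partial_\mu\varphi_2
   + i e A_\mu(\eta + \varphi_1 + i\varphi_2)\bigr], \\[4pt]
D_\mu\phi^* D^\mu\phi\,\big|_{\text{quadratic}}
 &= \tfrac{1}{2}\,\partial_\mu\varphi_1\,\partial^\mu\varphi_1
   + \tfrac{1}{2}\,(\partial_\mu\varphi_2 + e\eta A_\mu)(\partial^\mu\varphi_2 + e\eta A^\mu) \\
 &= \tfrac{1}{2}\,\partial_\mu\varphi_1\,\partial^\mu\varphi_1
   + \tfrac{1}{2}\,e^2\eta^2 A'_\mu A'^\mu .
\end{aligned}
```

Because $A'_\mu$ differs from $A_\mu$ by a gradient, $F_{\mu\nu}$ is unchanged by the redefinition, so the only trace of $\varphi_2$ at this order is the mass term for $A'_\mu$.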

A more elegant way of seeing this is to note that we can always make a gauge transformation to ensure that $\phi$ is real (at least so long as $\phi \neq 0$; where it is zero, there may be problems). This means that $\phi(x) = (1/\sqrt{2})(\eta + \varphi_1)$; $\varphi_2$ disappears altogether, and its kinetic term reduces to $\tfrac{1}{2}e^2 A_\mu A^\mu (\eta + \varphi_1)^2$, which includes the mass term for $A_\mu$ as well as cubic and quartic interaction terms.

As before, the discussion can be generalized to 
nonabelian theories, although there are additional 
problems to be discussed later. If we have a local 
symmetry group G that breaks spontaneously to 
leave an unbroken subgroup H, then the gauge fields 
associated with H remain massless. Each of the 
(dim G — dim H) complementary fields “eats up” 
one of the Goldstone bosons, becoming massive in 
the process. We are left only with other, "radial" components of $\phi$, the massive Higgs fields.

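Consider, for example, a local SO(3) model, with scalar fields $\boldsymbol\phi = (\phi_a)_{a=1,2,3}$ and gauge potentials $\mathbf A_\mu = (A_{a\mu})$. The infinitesimal gauge transformations are

$$\delta\boldsymbol\phi = \delta\boldsymbol\alpha \times \boldsymbol\phi, \qquad \delta\mathbf A_\mu = \delta\boldsymbol\alpha \times \mathbf A_\mu - \frac{1}{e}\,\partial_\mu\delta\boldsymbol\alpha \qquad [24]$$

where $\delta\boldsymbol\alpha$ is the gauge parameter. The Lagrangian is

$$\mathcal{L} = \tfrac{1}{2}D_\mu\boldsymbol\phi \cdot D^\mu\boldsymbol\phi - \tfrac{1}{4}\mathbf F_{\mu\nu} \cdot \mathbf F^{\mu\nu} - \tfrac{1}{8}\lambda(\boldsymbol\phi^2 - \eta^2)^2 \qquad [25]$$

where the covariant derivative and gauge field are

$$D_\mu\boldsymbol\phi = \partial_\mu\boldsymbol\phi + e\mathbf A_\mu \times \boldsymbol\phi, \qquad \mathbf F_{\mu\nu} = \partial_\mu\mathbf A_\nu - \partial_\nu\mathbf A_\mu + e\mathbf A_\mu \times \mathbf A_\nu \qquad [26]$$

If we take $\langle\boldsymbol\phi\rangle$ in the 3-direction, the fields $A_{1\mu}$ and $A_{2\mu}$ absorb the Goldstone fields $\phi_1$, $\phi_2$ to become massive. As in the abelian case, we can use the local SO(3) invariance to rotate $\boldsymbol\phi$ everywhere to the 3-direction, and write $\boldsymbol\phi = (0, 0, \eta + \varphi_3)$. In this gauge the kinetic term $\tfrac{1}{2}(e\mathbf A_\mu \times \boldsymbol\phi)^2$ gives a mass $e\eta$ to the fields $A_{1\mu}$, $A_{2\mu}$, while $A_{3\mu}$ remains massless, and the Higgs field $\varphi_3$ again has mass $\sqrt{\lambda}\,\eta$.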
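The statement that only the components of the gauge field transverse to the vacuum direction acquire a mass can be checked with a few lines of vector algebra. In the sketch below, which is purely illustrative, a single spacetime component of $\mathbf A_\mu$ is treated as an ordinary 3-vector in internal space (so Lorentz index signs are ignored), the scalar field is frozen at its vacuum value $(0, 0, \eta)$, and the function names and numbers are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def gauge_mass_term(a_vec, e, eta):
    """(1/2) (e A x phi)^2 evaluated at phi = (0, 0, eta).

    a_vec holds the three internal components (A_1, A_2, A_3) of one
    spacetime component of the gauge potential.
    """
    phi = (0.0, 0.0, eta)
    c = cross(a_vec, phi)
    return 0.5 * e**2 * sum(x * x for x in c)


if __name__ == "__main__":
    e, eta = 0.3, 2.0  # illustrative values only
    # A_3 alone gives no mass term; A_1 or A_2 gives (1/2) (e eta)^2 A^2.
    print(gauge_mass_term((0.0, 0.0, 5.0), e, eta))
    print(gauge_mass_term((1.0, 0.0, 0.0), e, eta), 0.5 * (e * eta) ** 2)
    print(gauge_mass_term((0.0, 1.0, 0.0), e, eta), 0.5 * (e * eta) ** 2)
```

The term evaluates to $\tfrac{1}{2}e^2\eta^2(A_1^2 + A_2^2)$, with no contribution from $A_3$, which is the residual massless gauge field of the unbroken subgroup.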
Elitzur's Theorem; the Role of Gauge Fixing


The concept of spontaneous symmetry breaking in 
the context of a local symmetry requires further 
discussion, in particular because of Elitzur’s theo- 
rem, proved in 1975, which states in essence that 
“spontaneous breaking of a local symmetry is 
impossible.” In the light of this theorem, it may 
seem that a “spontaneously broken gauge theory” is 
an oxymoron. In fact, it means something rather 
different, although even that is not unproblematic. 

The theorem was proved in the context of lattice gauge theory, where the spatial continuum is replaced by a discrete lattice. The scalar field is then represented by values $\phi_x$ at each lattice site, and the gauge potential by values $A_{x,\mu}$ on the links of the lattice. This is significant because on the lattice one can use a manifestly gauge-invariant formalism. Expectation values of gauge-invariant physical variables can be found, for example, by a Monte Carlo algorithm that effectively averages over all possible gauges. In this context, it is possible to show that the expectation value of any gauge-noninvariant operator (such as $\phi_x$) necessarily vanishes identically.

To be more specific, suppose we incorporate a symmetry-breaking term of the form $-\mathbf j \cdot \sum_x \phi_x$, and consider the limits $\mathcal{V} \to \infty$ followed by $j \to 0$. In the global-symmetry case, as we noted earlier, this yields the nonzero result [13]. However, in the case of a local gauge symmetry, one can show rigorously that

$$\lim_{j \to 0}\,\lim_{\mathcal{V} \to \infty} \langle\phi_x\rangle_{j\mathbf n} = 0 \qquad [27]$$

The essential reason for this is that we can make a gauge transformation in the neighborhood of the point x to make $\phi_x$ have any value we like without changing the energy by more than a very small amount that goes to zero as $j \to 0$. Within this manifestly gauge-invariant formalism, it is clear that the expectation value of a gauge-noninvariant operator such as $\phi$ is not an appropriate order parameter. One must instead look for a gauge-invariant order parameter.

It is important to note, however, that this result 
applies only in the context of a manifestly gauge- 
invariant formalism. But, in general, gauge theories 
cannot be quantized in a manifestly gauge-invariant 
way. In a path-integral formalism, the action 
functional, which appears in the exponent, is 
constant along the orbits of the gauge-group action. 
Consequently, the integral contains an infinite 
factor, the volume of the (infinite-dimensional) 
gauge group. There are corresponding divergences in the perturbation series. As is well known, this
problem can be dealt with by introducing a gauge- 
fixing term, which explicitly breaks the gauge 
symmetry, and renders Elitzur's theorem inapplic- 
able. But this procedure leaves a global symmetry 
unbroken, and it is in fact that global symmetry that 
is broken spontaneously. 

One example is the Landau-Ginzburg model of a superconductor, which is essentially just the nonrelativistic limit of the abelian Higgs model, although there is one significant difference: here the field $\phi$ annihilates a Cooper pair, a bound pair of electrons with equal and opposite momenta and spins, so e above is replaced by the charge 2e of a Cooper pair. The appearance of a condensate of Cooper pairs in the low-temperature superconducting phase corresponds to a state in which $\langle\phi\rangle$ is nonzero. This would not be possible without fixing a gauge. In the nonrelativistic context, the obvious gauge to choose is the Coulomb gauge, defined by the condition $\partial_k A^k = 0$. This gauge-fixing condition breaks the local symmetry explicitly, but it leaves unbroken the global symmetry $\phi(x) \mapsto \phi(x)e^{i\alpha}$ with constant $\alpha$. It is that global symmetry that is spontaneously broken when $\langle\phi\rangle \neq 0$.

For a model with nonabelian local symmetry the
standard procedure used to derive a perturbation
expansion is that of Faddeev and Popov. Consider,
for example, the SO(3) gauge theory discussed in the
preceding section. To fix the gauge, we can choose a
set of functions $F=(F_a)$ of the fields, and introduce
into the path integral a gauge-fixing term of the form

$$\mathcal{L}_{\rm gf} = -\frac{1}{2\xi}F^2 \qquad [28]$$

where $\xi$ is an arbitrary real constant. However, to
ensure that this does not bias the integral, so that the 
gauge-fixed theory is at least formally equivalent to 
the original gauge-invariant theory, one must also 
include the determinant of the Jacobian matrix 


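$$J_{ab}(x,y) = \frac{\delta F_a(x)}{\delta\omega_b(y)} \qquad [29]$$

(here $\omega_b(y)$ stands for the parameters of an infinitesimal gauge transformation).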


The easiest way to do this is to introduce
Faddeev-Popov ghost fields $C$, $\bar C$, which are scalar
Grassmann variables, and an appropriate term in the
Lagrangian

$$\mathcal{L}_{\rm gh} = \bar C\,J\,C \qquad [30]$$


For the SO(3) model, a convenient choice of gauge is
the $R_\xi$ gauge defined by

$$F = \partial_\mu A^\mu - \xi e\eta\, n\times\hat\phi \qquad [31]$$

where n is an arbitrarily chosen unit vector. It is
clear that the full Lagrangian $\mathcal{L}+\mathcal{L}_{\rm gf}+\mathcal{L}_{\rm gh}$ is no
longer invariant under the full SO(3) gauge group,
although there is a residual U(1) gauge invariance
corresponding to rotations about n. In this gauge,
the arbitrary choice of n means that the global
SO(3) symmetry is also broken. However, for other
choices, such as the Lorentz gauge $F=\partial_\mu A^\mu$ or
axial gauge $F=A_3$, the Lagrangian is invariant
under global SO(3) rotations of all the fields. This
global symmetry is then spontaneously broken, with
$\hat\phi$ acquiring as before a nonzero expectation value of
the form $\langle\hat\phi(x)\rangle = \eta n$.

It is interesting to look again at the particle
content of this model. By setting $\hat\phi(x) = \eta n + \varphi(x)$
with $n=(0,0,1)$, one finds that in the quadratic part
of the Lagrangian, the cross-terms between $A_\mu$ and $\varphi$
combine to form a total divergence which can be
dropped. As before, $\varphi_3$ is the Higgs field, with
$m^2=\lambda\eta^2$, $A_{3\mu}$ is the massless gauge field
corresponding to the unbroken gauge symmetry, and the
three transverse components of $A_{1\mu}$ and $A_{2\mu}$
represent the massive vector fields, with $m^2=e^2\eta^2$.
There are, however, also unphysical fields with
$\xi$-dependent masses: $\varphi_{1,2}$, $C_{1,2}$, $\bar C_{1,2}$, and the
longitudinal components $\partial_\mu A^\mu_{1,2}$ all have $m^2=\xi e^2\eta^2$. We
can now compute the effective potential $V_{\rm eff}(T,\phi)$.
One point that should be noted in performing this
calculation is that the ghost fields $C$, $\bar C$ contribute
negatively. Obviously, $V_{\rm eff}$, being $\xi$-dependent, is
not itself physically meaningful. Nevertheless, it can
be shown that the stationary points of $V_{\rm eff}$ are
physical, and correspond to the possible equilibrium
states of the theory. Moreover, the extremal values
of $V_{\rm eff}$ are independent of $\xi$ and give correctly the
thermodynamic potential in the corresponding
equilibrium states. The negative contributions from the
ghost fields to $N_*$ and $M_*^2$ ensure that the $\xi$
dependence cancels out, and we find as expected
$N_*=9$ and $M_*^2=(\lambda+6e^2)\phi^2$.


Phase Transitions and Crossovers 


Our discussion so far has for the most part been 
restricted to a semiclassical or mean-field approx- 
imation. It is important to bear in mind, however, 
that this approximation does not suffice to deter- 
mine whether a phase transition (where the thermo- 
dynamic free energy is nonanalytic) exists, or what 
its nature is. Determining the detailed characteristics 
of phase transitions requires other methods, such as 
the renormalization group or lattice simulations. In 
many cases, it is far from trivial to establish the 
order of the transition, or even whether a true phase 
transition actually exists. 


Gauge theories pose particular problems because 
of the infrared divergences in the thermal field 
theory at high temperature, and because in asymp- 
totically free nonabelian theories the coupling 
becomes large at very low energy. Even when they 
appear to exhibit spontaneous symmetry breaking, 
they do not necessarily undergo a true phase 
transition. Lattice gauge theory calculations have 
led to the conclusion that in nonabelian gauge 
theories with the Higgs field in the fundamental 
representation, there are values of the coupling 
constants for which there is no phase transition, 
only a rapid but smooth crossover from one type of 
behavior to another, so that the high- and low- 
temperature phases are analytically connected. If the 
coupling constant is small, there is a first-order 
phase transition, and for moderate values the theory 
exhibits a very rapid crossover that looks quite 
similar to a symmetry-breaking phase transition. 
Nevertheless, the analytic connection between the 
two phases implies that there cannot exist an order 
parameter that is strictly zero above the transition 
and nonzero below it. 

In particular, it appears that for physical values
of the Higgs mass, the electroweak theory does not
in fact undergo a true phase transition. It is
somewhat ironic that the most famous example of a 
spontaneously broken gauge theory probably does 
not, strictly speaking, exhibit a symmetry-breaking 
phase transition! 


Conclusions 


We have discussed the main features of spontaneous 
symmetry breaking in both the global- and local- 
symmetry cases, especially the appearance of Gold- 
stone bosons when a continuous global symmetry 
breaks, and their elimination in the local-symmetry 
case by the Higgs mechanism, as well as the 
problems attaching to the concept of spontaneous 
symmetry breaking in gauge theories. 


See also: Abelian Higgs Vortices; Effective Field 
Theories; Electroweak Theory; Finite Group Symmetry 
Breaking; Lattice Gauge Theory; Noncommutative 
Geometry and the Standard Model; Phase Transitions in 
Continuous Systems; Quantum Central Limit Theorems; 
Quantum Spin Systems; Symmetries in Quantum Field 
Theory of Lower Spacetime Dimensions; Topological 
Defects and their Homotopy Classification. 
Introduction


A classification of random matrix ensembles by
symmetries was first established by Dyson, in an
influential 1962 paper with the title "The threefold
way: algebraic structure of symmetry groups and
ensembles in quantum mechanics." Dyson's threefold
way has since become fundamental to various
areas of theoretical physics, including the statistical 
theory of complex many-body systems, mesoscopic 
physics, disordered electron systems, and the field of 
quantum chaos. 

Over the last decade, a number of random matrix 
ensembles beyond Dyson's classification have come 
to the fore in physics and mathematics. On the 
physics side, these emerged from work on the low- 
energy Dirac spectrum of quantum chromodynamics 
(QCD) and from the mesoscopic physics of low- 
energy quasiparticles in disordered superconductors. 
In the mathematical research area of number theory, 
the study of statistical correlations in the values of 
Riemann zeta and similar functions has prompted 
some of the same generalizations. 

In this article, Dyson's fundamental result will be 
reviewed from a modern perspective, and the recent 
extension of Dyson's threefold way will be moti- 
vated and described. In particular, it will be 
explained why symmetry classes are associated 
with large families of symmetric spaces. 


The Framework 


Random matrices have their physical origin in the 
quantum world, more precisely in the statistical 
theory of strongly interacting many-body systems 
such as atomic nuclei. Although random matrix 
theory is nowadays understood to be of relevance to 


numerous areas of physics — see Random Matrix 
Theory in Physics — quantum mechanics is still 
where many of its applications lie. Quantum 
mechanics also provides a natural framework in 
which to classify random matrix ensembles. 
Following Dyson, the mathematical setting for 
classification consists of two pieces of data: 


• A finite-dimensional complex vector space V with
  a Hermitian scalar product $\langle\cdot,\cdot\rangle$, called a "unitary
  structure" for short. (In physics applications,
  V will usually be the truncated Hilbert space of
  a family of quantum Hamiltonian systems.)

• On V there acts a group G of unitary and
  antiunitary operators (the joint symmetry group
  of the multiparameter family of quantum systems).


Given this setup, one is interested in the linear space 
of self-adjoint operators on V — the Hamiltonians H 
— with the property that they commute with the 
G-action. Such a space is reducible in general, that 
is, the matrix of H decomposes into blocks. The goal 
of classification is to list all of the irreducible blocks 
that occur. 


Symmetry Groups 


Basic to classification is the notion of a symmetry 
group in quantum Hamiltonian systems, a notion 
that will now be explained. 

In classical mechanics, the symmetry group $G_0$ of
a Hamiltonian system is understood to be the group
of canonical transformations that commute with the
phase flow of the system. An important example is
the rotation group for systems in a central field.

In passing from classical to quantum mechanics,
one replaces the classical phase space by a
quantum-mechanical Hilbert space V and assigns to the
symmetry group $G_0$ a (projective) representation by
unitary $\mathbb{C}$-linear operators on V. Besides the
one-parameter continuous subgroups, whose significance
is highlighted by Noether's theorem, the components
of $G_0$ not connected with the identity play an
important role. A prominent example is provided by
the operator for space reflection. Its eigenspaces are
the subspaces of states with positive and negative
parity; these reduce the matrix of any
reflection-invariant Hamiltonian to two blocks.
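The block reduction by parity can be checked numerically. The following is only a sketch under assumptions that are not in the text: a six-site grid on which reflection reverses the site order, a random Hermitian matrix made reflection-invariant by averaging over the group {Id, P}, and the hypothetical helper names `parity_operator`, `symmetrize`, and `parity_basis`.

```python
import numpy as np

def parity_operator(n):
    """Reflection on a symmetric grid of n sites: reverses the site order."""
    return np.fliplr(np.eye(n))

def symmetrize(h, p):
    """Average h over the group {Id, P}; the result commutes with P."""
    return 0.5 * (h + p @ h @ p)

def parity_basis(n):
    """Orthonormal even states followed by odd states (n assumed even)."""
    half = n // 2
    eye, flip = np.eye(half), np.fliplr(np.eye(half))
    even = np.vstack([eye, flip]) / np.sqrt(2)
    odd = np.vstack([eye, -flip]) / np.sqrt(2)
    return np.hstack([even, odd])

rng = np.random.default_rng(0)
n = 6
a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
p = parity_operator(n)
h = symmetrize(0.5 * (a + a.conj().T), p)   # Hermitian and reflection-invariant

b = parity_basis(n)
h_blocks = b.conj().T @ h @ b               # h written in the parity eigenbasis
off_diagonal = np.abs(h_blocks[: n // 2, n // 2 :]).max()
```

In the parity eigenbasis the matrix elements between even and odd states vanish to rounding error, so `h_blocks` consists of one even block and one odd block.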

Not all symmetries of a quantum-mechanical
system are of the canonical, unitary kind: the
prime counterexample is the operation of inverting
the time direction, called time reversal for short. In
classical mechanics, this operation reverses the sign
of the symplectic structure of phase space; in
quantum mechanics, its algebraic properties reflect
the fact that inverting the time direction, $t\mapsto -t$,
amounts to sending $i=\sqrt{-1}$ to $-i$. Indeed, time $t$
enters in the Dirac, Pauli, or Schrödinger equation
as $i\hbar\,d/dt$. Therefore, time reversal is represented in
the quantum theory by an antiunitary operator T,
which is to say that T is complex antilinear:

$$T(zv) = \bar z\,Tv \qquad (z\in\mathbb{C},\ v\in V)$$

and preserves the Hermitian scalar product or
unitary structure up to complex conjugation:

$$\langle Tv_1, Tv_2\rangle = \overline{\langle v_1, v_2\rangle} = \langle v_2, v_1\rangle$$

Another operation of this kind is charge conjugation
in relativistic theories such as the Dirac equation.

By the symmetry group G of a quantum-mechanical
system with Hamiltonian H, one then means the group
of all unitary and antiunitary transformations g of V
that leave the Hamiltonian invariant: $gHg^{-1}=H$. We
denote the unitary subgroup of G by $G_0$, and the set of
antiunitary operators in G by $G_1$ (not a group). If V
carries extra structure, as will be the case for some
extensions of Dyson's basic scheme, the action of G on
V has to be compatible with that structure.

The set $G_1$ may be empty. When it is not, the
composition of any two elements of $G_1$ is unitary, so
every $g\in G_1$ can be obtained from a fixed element of
$G_1$, say T, by right multiplication with some $U\in G_0$:
$g=TU$. In other words, when $G_1$ is nonempty the
coset space $G/G_0$ consists of exactly two elements, $G_0$
and $T\cdot G_0=G_1$. We shall assume that T represents
some inversion symmetry such as time reversal or
charge conjugation. T must then be a (projective)
involution, that is, $T^2=z\times\mathrm{Id}$ with z a complex
number of unit modulus, so that conjugation by $T^2$ is
the identity operation. Since T is complex antilinear,
the associative law $T^2\cdot T=T\cdot T^2$ gives $zT=Tz=\bar zT$,
which forces z to be real, and hence $T^2=\pm\mathrm{Id}$.
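Both signs of $T^2$ occur in practice. The sketch below is an illustration with choices that are not taken from the text: T is modeled as a unitary matrix followed by complex conjugation K, with T = K for a spinless two-level system and T = iσ_y K for spin 1/2 (a common textbook convention); the helper name `antiunitary` is hypothetical.

```python
import numpy as np

def antiunitary(u):
    """Return T = u K as a function on vectors, K being complex conjugation."""
    return lambda v: u @ np.conj(v)

sigma_y = np.array([[0, -1j], [1j, 0]])
t_spinless = antiunitary(np.eye(2))         # T = K
t_spin_half = antiunitary(1j * sigma_y)     # T = i sigma_y K

v = np.array([1.0 + 2.0j, -0.5 + 0.3j])
w = np.array([0.4 - 1.0j, 2.0 + 0.1j])
z = 0.7 - 1.1j

square_spinless = t_spinless(t_spinless(v))     # T^2 v for T = K
square_spin_half = t_spin_half(t_spin_half(v))  # T^2 v for T = i sigma_y K
```

Applying either map twice returns the input vector up to a sign: the sign is +1 for T = K and -1 for T = iσ_y K. The same helper can be used to confirm antilinearity and the conjugated scalar product.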

Finding the total symmetry group of a Hamiltonian 
system need not always be straightforward, but this 
complication will not be an issue here: we take the 
symmetry group G and its action on the Hilbert 
space V as fundamental and given, and then ask 
what are the corresponding symmetry classes,
meaning the irreducible spaces of Hamiltonians on 
V that commute with G. 

For technical reasons, we assume the group $G_0$ to
be compact; this is an assumption that covers most
(if not all) of the cases of interest in physics. The
noncompact group of space translations can be
incorporated, if necessary, by wrapping the system
around a torus, whereby translations are turned into
compact torus rotations.

While the primary objects to classify are the
spaces of Hamiltonians H, we shall focus for
convenience on the spaces of time evolutions
$U_t=e^{-itH/\hbar}$ instead. This change of focus results in
no loss, as the Hamiltonians can always be retrieved
by linearizing in $t$ at $t=0$.


Symmetric Spaces 


We appropriate a few basic facts from the theory of 
symmetric spaces. 

Let M be a connected m-dimensional Riemannian
manifold and p a point of M. In some open subset
$N_p$ of a neighborhood of p there exists a map
$s_p: N_p\to N_p$, the geodesic inversion with respect to
p, which sends a point $x\in N_p$ with normal
coordinates $(x_1,\dots,x_m)$ to the point with normal
coordinates $(-x_1,\dots,-x_m)$. The Riemannian manifold
M is called locally symmetric if the geodesic
inversion is an isometry, and is called globally
symmetric if $s_p$ extends to an isometry $s_p: M\to M$,
for all $p\in M$. A globally symmetric Riemannian
manifold is called a symmetric space for short.
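A familiar instance of this definition is the unit sphere, chosen here purely as an illustration (it is not an example from the text). There the geodesic inversion at p is the rotation by π about the axis through p, which extends to the whole sphere. The function name below is hypothetical.

```python
import numpy as np

def geodesic_inversion(p, x):
    """Geodesic inversion of the unit sphere at p: keeps the component of x
    along p and flips the component tangent to the sphere at p."""
    return 2.0 * np.dot(p, x) * p - x

def unit(v):
    """Normalize a vector to unit length."""
    return v / np.linalg.norm(v)

rng = np.random.default_rng(5)
p, x, y = (unit(rng.normal(size=3)) for _ in range(3))
sx, sy = geodesic_inversion(p, x), geodesic_inversion(p, y)
```

The map fixes p, preserves inner products (hence geodesic distances), and is its own inverse, which is what the definition of a globally symmetric space requires at every point p.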

The Riemann curvature tensor of a symmetric 
space is covariantly constant, which leads one to 
distinguish between three cases: the scalar curvature 
can be positive, zero, or negative, and the symmetric 
space is said to be of compact type, Euclidean type, 
or noncompact type, respectively. (In mesoscopic 
physics, each type plays a role: the first provides us 
with the scattering matrices and time evolutions, the 
second with the Hamiltonians, and the third with 
the transfer matrices.) The focus in the current 
article will be on compact type, as it is this type that 
houses the unitary time evolution operators of 
quantum mechanics. The compact symmetric spaces 
are subdivided into two major subtypes, both of 
which occur naturally in the present context, as 
follows. 


Type II


Consider first the case where the antiunitary
component $G_1$ of the symmetry group is empty, so
the data are (V, G) with $G=G_0$. Let $\mathcal{U}(V)$ denote
the group of all complex linear transformations that
leave the structure of the vector space V invariant.
Thus, $\mathcal{U}(V)$ is the group of unitary transformations if
V carries no more than the usual Hermitian scalar
product; and is some subgroup of the unitary group
if V does have extra structure (as is the case for the
Nambu space of quasiparticle excitations in a
superconductor). The symmetry group $G_0$, by acting
on V and preserving its structure, is contained as a
subgroup in $\mathcal{U}(V)$.

Let now H be any Hamiltonian with the prescribed
symmetries. Then the time evolution
$t\mapsto U_t=e^{-itH/\hbar}$ generated by H is a one-parameter
subgroup of $\mathcal{U}(V)$ which commutes with the
$G_0$-action. The total set of transformations $U_t$ that
arise in this way is called the (connected part of the)
"centralizer" of $G_0$ in $\mathcal{U}(V)$, and is denoted by Z.
This is the "good" set of unitary time evolutions:
the set compatible with the given symmetries of an
ensemble of quantum systems.

The centralizer Z is obviously a group: if U and
U' belong to Z, then so do their inverses and their
product. What can one say about the structure of
the group Z?

Since $G_0$ is compact by assumption, its group
action on V is completely reducible and V is
guaranteed to have an orthogonal decomposition

$$V=\bigoplus_\lambda V_\lambda$$

where the sum runs over isomorphism classes of
irreducible $G_0$-representations $\lambda$, and the vector
spaces $V_\lambda$ are called the $G_0$-isotypic components of
V. For example, if $G_0$ is the rotation group $SO_3$, the
$G_0$-isotypic component $V_\lambda$ of V is the subspace
spanned by all the states with total angular
momentum $\lambda$.

Consider now any $U\in Z$. Since U commutes with
the $G_0$-action, it does not connect different
$G_0$-isotypic components. (Indeed, in the example of
$SO_3$-invariant dynamics, angular momentum is
conserved and transitions between different angular
momentum sectors are forbidden.) Thus, every
$G_0$-isotypic component $V_\lambda$ is an invariant subspace
for the action of Z on V, and Z decomposes as
$Z=\prod_\lambda Z_\lambda$ with blocks $Z_\lambda=Z|_{V_\lambda}$.

To say more, fix a standard irreducible
$G_0$-module $R_\lambda$ of isomorphism class $\lambda$ and consider

$$L_\lambda=\mathrm{Hom}_{G_0}(R_\lambda,V_\lambda)$$

the linear space of $\mathbb{C}$-linear maps $l:R_\lambda\to V_\lambda$ that
intertwine the $G_0$-actions on $R_\lambda$ and $V_\lambda$. An element
of $L_\lambda$ is called a $G_0$-equivariant homomorphism. By
Schur's lemma, $L_\lambda\cong\mathbb{C}$ if $V_\lambda$ is $G_0$-irreducible. More
generally, $\dim L_\lambda=:m_\lambda$ counts the multiplicity of
occurrence of $R_\lambda$ in $V_\lambda$; for example, in the case of
$G_0=SO_3$ we take $R_\lambda$ to be the standard irreducible
module of dimension $2\lambda+1$; and $m_\lambda$ then is the
number of times a multiplet of states with total
angular momentum $\lambda$ occurs in $V_\lambda$.

The natural mapping $L_\lambda\otimes R_\lambda\to V_\lambda$ by $l\otimes r\mapsto l(r)$
is an isomorphism,

$$V_\lambda\cong L_\lambda\otimes R_\lambda$$

and using it we can transfer the entire discussion
from $V_\lambda$ to $L_\lambda\otimes R_\lambda$. The group $G_0$ acts trivially on
$L_\lambda\cong\mathbb{C}^{m_\lambda}$ and irreducibly on $R_\lambda$. Therefore, the
component $Z_\lambda$ of the centralizer Z is the unitary
group

$$Z_\lambda\cong U(L_\lambda)\cong U_{m_\lambda}$$

if V is a unitary vector space with no extra structure.
In the presence of extra structure (which, by
compatibility with the $G_0$-action, restricts to every
subspace $V_\lambda$), the factor $Z_\lambda$ is some subgroup of
$U_{m_\lambda}$. In all cases, Z is a direct product of connected
compact Lie groups $Z_\lambda$.

To make the connection with symmetric spaces, write
$M:=Z_\lambda$. Since M is a group, the operation of taking
the inverse, $U\mapsto U^{-1}$, makes sense for all $U\in M$.
Moreover, being a compact Lie group, the manifold M
admits a left- and right-invariant Riemannian structure
in which the inversion $U\mapsto U^{-1}$ is an isometry. By
translation, one gets an isometry $s_{U_1}: U\mapsto U_1U^{-1}U_1$
for every $U_1\in M$. All of these maps $s_{U_1}$ are globally
defined, and the restriction of $s_{U_1}$ to some neighborhood
of $U_1$ coincides with the geodesic inversion with respect
to $U_1$. Thus, M is a symmetric space by the definition
given above. Symmetric spaces of this kind are called
type II.


Type I


Consider next the case of $G_1\neq\emptyset$, where some
antiunitary symmetry T is present. As before, let Z
be the connected component of the centralizer of $G_0$
in $\mathcal{U}(V)$. Conjugation by T,

$$U\mapsto\tau(U):=TUT^{-1}$$

is an automorphism of $\mathcal{U}(V)$ and, owing to $T^2=\pm\mathrm{Id}$,
$\tau$ is involutive. Because $G_0\subset G$ is a normal
subgroup, $\tau$ restricts to an involutive automorphism
(still denoted by $\tau$) of Z. Now recall that T is
complex antilinear and the good Hamiltonians are
subject to $THT^{-1}=H$. The good time evolutions
$U_t=e^{-itH/\hbar}$ clearly satisfy $\tau(U_t)=U_{-t}=U_t^{-1}$. Thus,
the good set to consider is $\mathcal{M}:=\{U\in Z\mid U=\tau(U)^{-1}\}$.
The set $\mathcal{M}$ is a manifold, but in general is not a
Lie group.

Further details depend on what $\tau$ does with the
factorization $Z=\prod_\lambda Z_\lambda$. If $V_\lambda$ is a $G_0$-isotypic
component of V, then so is $TV_\lambda$, since T normalizes
$G_0$. Thus, either $V_\lambda\cap TV_\lambda=0$, or $TV_\lambda=V_\lambda$. In the
former case, the involutive automorphism $\tau$ just
relates $U\in Z_\lambda$ with $\tau(U)\in Z_{T\lambda}$ (the block acting on
$TV_\lambda$), whence no intrinsic constraint on $Z_\lambda$ results,
and the time evolutions
$(U,\tau(U)^{-1})\in Z_\lambda\times Z_{T\lambda}$ constitute a type-II
symmetric space, as before.

A novel situation occurs when $TV_\lambda=V_\lambda$, in which
case $\tau$ restricts to an automorphism of $Z_\lambda$. Let
therefore $TV_\lambda=V_\lambda$, put $K=Z_\lambda$ for short, and
consider

$$M:=\{U\in K\mid U=\tau(U)^{-1}\}$$

Note that if two elements $p,p_0$ of K are in M,
then so is the product $p_0p^{-1}p_0$. The group K acts on
$M\subset K$ by

$$k\cdot U=kU\tau(k)^{-1}\qquad(k\in K)$$

and this group action is transitive, that is, every $U\in M$
can be written as $U=k\tau(k)^{-1}$ with some $k\in K$.
(Finding k for a given U is like taking a square root,
which is possible since $\exp:\mathrm{Lie}\,K\to K$ is surjective.)
There exists a K-invariant Riemannian structure
for M such that for all $p_0\in M$ the mapping $s_{p_0}:M\to M$
defined by

$$s_{p_0}(p)=p_0\,p^{-1}p_0$$

is the geodesic inversion with respect to $p_0\in M$.
Thus, in this natural geometry M is a globally
symmetric Riemannian manifold and hence a symmetric
space. The present kind of symmetric space is
called type I. If $K_\tau$ is the set of fixed points of $\tau$ in K,
the symmetric space M is analytically diffeomorphic
to the coset space $K/K_\tau$ by

$$K/K_\tau\to M\subset K,\qquad UK_\tau\mapsto U\tau(U)^{-1}$$

which we call the "Cartan embedding" of $K/K_\tau$
into K.
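The Cartan embedding can be made concrete in the simplest case. The sketch below assumes, for illustration only, that K is the unitary group U_4 and that T is complex conjugation, so that τ(U) is the complex conjugate of U and the fixed-point group is the real orthogonal group; the function names are hypothetical.

```python
import numpy as np

def random_unitary(n, rng):
    """A unitary matrix from the QR decomposition of a complex Gaussian matrix."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(a)
    return q

def tau(u):
    """tau(U) = T U T^{-1} for T = complex conjugation (an assumed choice)."""
    return np.conj(u)

def cartan_embedding(k):
    """Map the coset of k to k tau(k)^{-1}."""
    return k @ np.linalg.inv(tau(k))

def in_m(u):
    """Membership test for M = {U in K : U = tau(U)^{-1}}."""
    return np.allclose(u, np.linalg.inv(tau(u)))

def geodesic_inversion(p0, p):
    """s_{p0}(p) = p0 p^{-1} p0."""
    return p0 @ np.linalg.inv(p) @ p0

rng = np.random.default_rng(2)
k = random_unitary(4, rng)
o, _ = np.linalg.qr(rng.normal(size=(4, 4)))   # real orthogonal, fixed by tau
p = cartan_embedding(k)
p0 = cartan_embedding(random_unitary(4, rng))
```

With this choice of τ the image consists of symmetric unitary matrices, the embedding does not change when k is multiplied on the right by an element of the fixed-point group, and the inversion `geodesic_inversion` maps M into itself.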

In summary, the solution to the problem of 
finding the set of unitary time evolution operators 
that are compatible with a given symmetry group G 
and structure of Hilbert space V is always a 
symmetric space. This is a valuable insight, as 
symmetric spaces are rigid objects and have been 
completely classified by Cartan. 

If the dimension of V is kept variable, the 
irreducible symmetric spaces that occur belong to 
one of the large families listed in Table 1. 


Dyson's Threefold Way 


Table 1  The large families of symmetric spaces. The form of H
in the header applies to the last seven families

Family   Symmetric space                          Form of $H=\begin{pmatrix}W&Z\\Z^\dagger&-W^{\rm T}\end{pmatrix}$

A        $U_N$                                    Complex Hermitian
AI       $U_N/O_N$                                Real symmetric
AII      $U_{2N}/USp_{2N}$                        Quaternion self-adjoint
C        $USp_{2N}$                               Z complex symmetric, $W=W^\dagger$
CI       $USp_{2N}/U_N$                           Z complex symmetric, $W=0$
D        $SO_{2N}$                                Z complex skew, $W=W^\dagger$
DIII     $SO_{2N}/U_N$                            Z complex skew, $W=0$
AIII     $U_{p+q}/U_p\times U_q$                  Z complex $p\times q$, $W=0$
BDI      $SO_{p+q}/SO_p\times SO_q$               Z real $p\times q$, $W=0$
CII      $USp_{2p+2q}/USp_{2p}\times USp_{2q}$    Z quaternion $2p\times 2q$, $W=0$


Recall the goal: given a Hilbert space V and a
symmetry group G acting on it, one wants to classify
the (irreducible) spaces of time evolution operators
U that are "compatible" with G, meaning

$$U=g_0Ug_0^{-1}=g_1U^{-1}g_1^{-1}\qquad(\text{for all } g_\sigma\in G_\sigma)$$


As we have seen, the spaces that arise in this way are 
symmetric spaces of type I or II depending on the 
nature of the time reversal (or other antiunitary 
symmetry) T. 

An even stronger statement can be made when 
more information about the Hilbert space V is 
specified. In Dyson's classification, the Hermitian 
scalar product of V is assumed to be the only 
invariant structure that exists on V. With that 
assumption, only three large families of symmetric 
spaces arise; these correspond to what we call the
"Wigner-Dyson symmetry classes."


Class A 


Recall that in Dyson's case, the connected part of the
centralizer of $G_0$ in $\mathcal{U}(V)$ is a direct product of
unitary groups, each factor being associated with one
$G_0$-isotypic component $V_\lambda$ of V. The type-II situation
occurs when the set $G_1$ of antiunitary symmetries is
either empty or else exchanges different $V_\lambda$. In both
cases, the set of good time evolution operators
restricted to one $G_0$-isotypic component $V_\lambda$ is a
unitary group $U_{m_\lambda}$, with $m_\lambda$ being the multiplicity of
the irreducible $G_0$-representation $\lambda$ in $V_\lambda$.

The unitary groups $U_{N=m_\lambda}$ or, to be precise, their
simple parts $SU_N$, are called type-II symmetric spaces
of the A family or A series; hence the name class A.
The Hamiltonians H, the generators of time evolutions
$U_t=e^{-itH/\hbar}$, in this class are represented by
complex Hermitian $N\times N$ matrices. By putting a
$U_N$-invariant Gaussian probability measure

$$\exp(-\mathrm{tr}\,H^2/2\sigma^2)\,dH\qquad(\sigma\in\mathbb{R})$$

on that space, one gets what is called the GUE (the
Gaussian unitary ensemble), which defines the
Wigner-Dyson universality class of unitary symmetry.
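Sampling from the GUE is straightforward. The sketch below is one possible way to do it, not a prescription from the text: it assumes dH is the flat measure on the independent real parameters of H, under which the weight exp(-tr H²/2σ²) gives the diagonal entries variance σ² and the real and imaginary parts of the off-diagonal entries variance σ²/2. The function names are hypothetical.

```python
import numpy as np

def sample_gue(n, sigma, rng):
    """Draw a Hermitian H with density proportional to exp(-tr H^2 / (2 sigma^2))."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return sigma * (a + a.conj().T) / 2.0

def log_weight(h, sigma):
    """Logarithm of the unnormalized Gaussian weight."""
    return -np.trace(h @ h).real / (2.0 * sigma**2)

def random_unitary(n, rng):
    """A unitary matrix from the QR decomposition of a complex Gaussian matrix."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(a)
    return q

rng = np.random.default_rng(1)
sigma = 0.8
h = sample_gue(5, sigma, rng)
u = random_unitary(5, rng)
h_rotated = u @ h @ u.conj().T    # conjugation by a unitary matrix
```

The Gaussian weight depends on H only through tr H², so `log_weight` takes the same value on `h` and `h_rotated`, which is the unitary invariance that gives the ensemble its name.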


Classes AI and AII


Consider next the case $G_1\neq\emptyset$, with antiunitary
generator T. Let $V_\lambda=TV_\lambda$ be any $G_0$-isotypic
component of V invariant under T (the type-I
situation). The mapping $U\mapsto TUT^{-1}=\tau(U)$ then is
an automorphism of the groups $U(V_\lambda)$, $G_0$, and
$K=Z_\lambda\cong U_{m_\lambda}$. If $K_\tau$ is the subgroup of fixed points
of $\tau$ in K, the space of good time evolutions can be
identified with the symmetric space $K/K_\tau$ by the
Cartan embedding. Our task is to determine $K_\tau$.

To simplify the notation let us write $V_\lambda=V$, $R_\lambda=R$,
and $L_\lambda=L$. We now ask what happens with
$T:V\to V$ in the process of transfer to $L\otimes R\cong V$.
The answer, so we claim, is that T transfers to a
pure tensor made from antiunitary maps $\alpha:L\to L$
and $\beta:R\to R$,

$$T=\alpha\otimes\beta$$

To prove this claim, let C be the antilinear map
from V to the dual vector space $V^*$ by $v\mapsto\langle v,\cdot\rangle$.
Because the elements of $G_0$ are represented by
unitaries, the $\mathbb{C}$-linear operator $CT:V\to V^*$
intertwines $G_0$-actions:

$$CT\,a(g)=(g^{-1})^{\rm t}\,CT\qquad(g\in G_0)$$

where a is the automorphism $a(g)=T^{-1}gT$ and
$(g^{-1})^{\rm t}$ is the induced action of g on $V^*$. From
the irreducibility of R it follows that the space of
intertwiners $R\to R^*$ is one dimensional here (Schur's
lemma). Therefore, $CT:L\otimes R\to L^*\otimes R^*$ must be a
pure tensor (as opposed to a sum of such tensors),
and since C is clearly a pure tensor, so is T. This
completes the proof.

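By the involutive property $T^2=\epsilon_T\,\mathrm{Id}_V$ ($\epsilon_T=\pm1$),
the two antiunitary factors of $T=\alpha\otimes\beta$ cannot
but square to $\alpha^2=\epsilon_\alpha\mathrm{Id}_L$ and $\beta^2=\epsilon_\beta\mathrm{Id}_R$, where
$\epsilon_\alpha,\epsilon_\beta=\pm1$ are related by $\epsilon_\alpha\epsilon_\beta=\epsilon_T$. The factor $\alpha$
determines a nondegenerate complex bilinear form
$Q:L\times L\to\mathbb{C}$ by

$$Q(l_1,l_2)=\langle\alpha l_1,l_2\rangle_L\qquad(l_1,l_2\in L)$$

Since $\alpha$ is antiunitary one has the exchange
symmetry

$$Q(l_1,l_2)=\overline{\langle\alpha^2l_1,\alpha l_2\rangle}_L=\epsilon_\alpha\,Q(l_2,l_1)$$

Thus, the complex bilinear form (or pairing) Q is
symmetric for $\epsilon_\alpha=+1$ and alternating for $\epsilon_\alpha=-1$.
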


Knowing the sign of $\epsilon_\alpha=\pm1$ we know the group
$K_\tau$. Indeed, an element $k\in K_\tau$ commutes with T and
after transfer from V to L still commutes with $\alpha$. But
since $K_\tau$ is a subgroup of $K=U_{m_\lambda}$, this means that
$k\in K_\tau$ preserves Q. In the case of $\epsilon_\alpha=+1$, what is
preserved is a symmetric pairing, and therefore
$K_\tau\cong O_{m_\lambda}$. For $\epsilon_\alpha=-1$, the multiplicity $m_\lambda$ must be even
and $K_\tau$ preserves an alternating pairing (or symplectic
structure); in that case $K_\tau\cong USp_{m_\lambda}$, the unitary
symplectic group.

Thus, there is a dichotomy for the sets of good
time evolutions $M\cong K/K_\tau$:

Class AI:   $K/K_\tau\cong U_N/O_N$            ($N=m_\lambda$)
Class AII:  $K/K_\tau\cong U_{2N}/USp_{2N}$    ($2N=m_\lambda$)

Again we are referring to symmetric spaces by the
names that they (or rather their simple parts $SU_N/SO_N$
and $SU_{2N}/USp_{2N}$) have in the Cartan classification.

In general, there is no immediate means of
predicting the parity $\epsilon_\alpha$, and one has no choice but
to go through the steps of constructing $\alpha$. If
$\beta:R\to R$ happens to be $G_0$-invariant, however, the
situation simplifies. In that case $\beta$ determines a
$G_0$-invariant pairing $R\times R\to\mathbb{C}$ (in the same way as
$\alpha$ determines $Q:L\times L\to\mathbb{C}$ above). On general
grounds, an irreducible $G_0$-representation space
admits at most one such pairing. If that pairing is
symmetric, then, as we have seen, $\epsilon_\beta=+1$; if it is
alternating, then $\epsilon_\beta=-1$. The parity $\epsilon_\alpha$ is given by
$\epsilon_\alpha\epsilon_\beta=\epsilon_T$.


Example  Consider any physical system with
spin-rotation symmetry ($G_0=SU_2$) and time-reversal
symmetry. The physical operation of time reversal,
T, commutes with spin rotations and, hence, here
is a case where the factor $\beta$ in $T=\alpha\otimes\beta$ is
$G_0$-invariant. On fundamental physics grounds one
has $T^2=(-1)^{2S}$ on states with spin S. The spin-S
representation of $SU_2$ is known to carry an invariant
pairing which is symmetric or skew depending on
whether the integer 2S is even or odd. Therefore,
$\epsilon_T=\epsilon_\beta$ and $\epsilon_\alpha=+1$ in all cases.

Thus, T-invariant systems with no symmetries
other than energy and spin invariably are class AI.
By breaking spin-rotation symmetry ($G_0=\{\mathrm{Id}\}$,
$\epsilon_\beta=+1$) while maintaining T-symmetry for states
with half-integer spin (say single electrons, which
carry spin $S=1/2$), one gets $\epsilon_\alpha=-1$, thereby
realizing class AII.


The Hamiltonians  By passing to the tangent space
of $K/K_\tau$ at unity one obtains Hermitian matrices
with entries that are real numbers (class AI) or real
quaternions (class AII). When $K_\tau$-invariant Gaussian
probability measures (called GOE and GSE, respectively) are
put on these spaces, one gets the Wigner-Dyson
universality classes of orthogonal and symplectic
symmetry, respectively. In mesoscopic physics, these are realized
in disordered metals with time-reversal invariance
(absence of magnetic fields and magnetic impurities).
Spin-rotation symmetry is broken by strong
spin-orbit scatterers such as gold impurities.


Warning 


The term "symmetry class" is not synonymous with
"universality class." Indeed, inside a symmetry class
many different types of physical behavior are
possible. For example, random matrix models for
disordered metallic grains with time-reversal symmetry
belong to the symmetry class of the example
above (class AI), and so do Anderson tight-binding
models with real hopping. The former are known to
exhibit energy level statistics of universal GOE type,
whereas the latter have localized eigenfunctions and
hence level statistics that are expected to approach
the Poisson limit when the system size goes to
infinity.


Disordered Superconductors 


When Dirac first wrote down his famous equation in 
1928, he assumed that he was writing an equation 
for the wave function of the electron. Later, because 
of the instability caused by negative-energy solu- 
tions, the Dirac equation was reinterpreted (via 
second quantization) as an equation for the ferm- 
ionic field operators of a quantum field theory. A 
similar change of viewpoint is carried out in reverse 
in the Hartree-Fock-Bogoliubov mean-field descrip- 
tion of quasiparticle excitations in superconductors. 
There, one starts from the equations of motion for 
linear superpositions of the electron creation and 
annihilation operators, and reinterprets them as a 
unitary quantum dynamics for what might be called
the quasiparticle "wave function."

In both cases — the Dirac equation and the 
quasiparticle dynamics of a superconductor — there 
enters a structure not present in the standard 
quantum mechanics underlying Dyson's classifica- 
tion: the field operators for fermionic particles are 
subject to a set of relations called the “canonical 
anticommutation relations," and these are preserved 
by the quantum dynamics. Therefore, whenever 
second quantization is undone (assuming it can be 
undone) to return from field operators to wave 
functions, the wave-function dynamics is required to 
preserve some extra structure. This puts a linear 
constraint on the good Hamiltonians H. For our 
purposes, the best viewpoint to take is to attribute
the extra invariant structure to the Hilbert space V, 
thereby turning it into a Nambu space. 


Nambu Space 


Adopting the standard physics conventions of
second quantization, consider some set of
single-particle creation and annihilation operators $c_i^\dagger$ and
$c_i$, where $i=1,2,\dots$ labels an orthonormal system
of single-particle states. Such operators are subject
to the canonical anticommutation relations (CARs)

$$c_i^\dagger c_j+c_jc_i^\dagger=\delta_{ij},\qquad
c_i^\dagger c_j^\dagger+c_j^\dagger c_i^\dagger=0=c_ic_j+c_jc_i\qquad[1]$$

When written in terms of $c_j+c_j^\dagger$ and $i(c_j-c_j^\dagger)$, these
become the standard defining relations of a Clifford
algebra over $\mathbb{R}$. Field operators are linear combinations
$\psi=\sum_i(u_ic_i+f_ic_i^\dagger)$ with complex coefficients $u_i$
and $f_i$.

Now take H to be some Hamiltonian which is
quadratic in the creation and annihilation operators:

$$H=\sum_{ij}W_{ij}\,c_i^\dagger c_j+\frac12\sum_{ij}\left(Z_{ij}\,c_i^\dagger c_j^\dagger+\bar Z_{ij}\,c_jc_i\right)$$

and let H act on field operators $\psi$ by the
commutator: $H\cdot\psi=[H,\psi]$. The time evolution of
$\psi$ is then determined by the Heisenberg equation of
motion

$$-i\hbar\frac{d\psi}{dt}=H\cdot\psi\qquad[2]$$

which integrates to $\psi(t)=e^{itH/\hbar}\cdot\psi(0)$, and is easily
verified to preserve the CARs [1].

The dynamical equation [2] is equivalent to a system of linear differential equations for the amplitudes $u_i$ and $f_i$. If these are assembled into vectors, and the $W_{ij}$ and $Z_{ij}$ into matrices, eqn [2] becomes

$$-\mathrm{i}\hbar \frac{d}{dt} \begin{pmatrix} u \\ f \end{pmatrix} = \begin{pmatrix} -W^{\mathrm T} & -\bar{Z} \\ Z & W \end{pmatrix} \begin{pmatrix} u \\ f \end{pmatrix}$$

The Hamiltonian matrix on the right-hand side has some special properties due to $Z_{ij} = -Z_{ji}$ (from $c_i c_j = -c_j c_i$) and $W_{ij} = \bar{W}_{ji}$ (from $H$ being self-adjoint as an operator in Fock space). To keep track of these properties while imposing some unitary and antiunitary symmetries, it is best to put everything in invariant form.

So, let $U$ be the unitary vector space of annihilation operators $u = \sum_i u_i c_i$, and view the creation operators $f = \sum_i f_i c_i^\dagger$ as lying in the dual vector space $U^*$. The field operators $\psi = u + f$ then are elements of the direct sum $U \oplus U^* =: V$, called "Nambu space." On $V$ there exists a canonical unitary structure expressed by

$$\langle \psi, \psi' \rangle = \sum_i \left( \bar{u}_i u_i' + \bar{f}_i f_i' \right)$$

A second canonical structure on $V = U \oplus U^*$ is given by the symmetric complex bilinear form

$$\{ \psi, \psi' \} = \sum_i \left( f_i u_i' + f_i' u_i \right) = f(u') + f'(u)$$

where the last expression uses the meaning of $f$ as a linear function $f : U \to \mathbb{C}$. Note that $\{\psi, \psi'\}$ agrees with the anticommutator of the field operators, $\psi\psi' + \psi'\psi$.

Now recall that the quantum dynamics is determined by a Hamiltonian $H$ that acts on $\psi$ by the commutator $H \cdot \psi = [H, \psi]$. The one-parameter groups $t \mapsto e^{\mathrm{i}tH/\hbar}$ generated by this action (the time evolutions) preserve the symmetric pairing:

$$\{ \psi, \psi' \} = \{ e^{\mathrm{i}tH/\hbar}\psi,\; e^{\mathrm{i}tH/\hbar}\psi' \}$$

since the anticommutation relations [1] do not change with time. They also preserve the unitary structure,

$$\langle \psi, \psi' \rangle = \langle e^{\mathrm{i}tH/\hbar}\psi,\; e^{\mathrm{i}tH/\hbar}\psi' \rangle$$

because probability in Nambu space is conserved. (Physically speaking, this holds true as long as $H$ is quadratic, i.e., many-body interactions are negligible.)
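Both invariance statements can be tested numerically. The sketch below assumes that, in the basis of amplitudes $(u, f)$, the action of $H$ is the block matrix $X = \begin{pmatrix} -W^{\mathrm T} & -\bar Z \\ Z & W \end{pmatrix}$ with $W$ Hermitian and $Z$ skew (obtained by working out $[H, \psi]$ from the CARs), that the symmetric pairing has matrix $S = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, and that $\hbar = 1$. The random matrices, the Taylor-series exponential, and all function names are illustrative choices.

```python
import random

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def conj(a):
    return [[complex(x).conjugate() for x in row] for row in a]

def dagger(a):
    return transpose(conj(a))

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j])
               for i in range(len(a)) for j in range(len(a[0])))

def expm(a, terms=60):
    """Matrix exponential by its Taylor series (adequate for small matrices)."""
    n = len(a)
    result, term = identity(n), identity(n)
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def nambu_generator(w, z):
    """Block matrix [[-W^T, -conj(Z)], [Z, W]] acting on the amplitudes (u, f)."""
    n = len(w)
    wt, zbar = transpose(w), conj(z)
    top = [[-wt[i][j] for j in range(n)] + [-zbar[i][j] for j in range(n)]
           for i in range(n)]
    bottom = [list(z[i]) + list(w[i]) for i in range(n)]
    return top + bottom

def random_w_and_z(n, seed=0):
    """Random Hermitian W and complex skew Z (illustrative test data)."""
    rng = random.Random(seed)
    def entry():
        return complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    a = [[entry() for _ in range(n)] for _ in range(n)]
    b = [[entry() for _ in range(n)] for _ in range(n)]
    w = [[(a[i][j] + a[j][i].conjugate()) / 2 for j in range(n)]
         for i in range(n)]
    z = [[b[i][j] - b[j][i] for j in range(n)] for i in range(n)]
    return w, z

N = 2
W, Z = random_w_and_z(N)
X = nambu_generator(W, Z)
t = 0.7  # time, in units with hbar = 1
U = expm([[1j * t * x for x in row] for row in X])

# Matrix of the symmetric pairing {psi, psi'} = psi^T S psi' in the (u, f) basis.
S = [[1.0 if abs(i - j) == N else 0.0 for j in range(2 * N)]
     for i in range(2 * N)]

hermiticity_defect = max_abs_diff(X, dagger(X))
unitarity_defect = max_abs_diff(matmul(dagger(U), U), identity(2 * N))
pairing_defect = max_abs_diff(matmul(matmul(transpose(U), S), U), S)
print(hermiticity_defect, unitarity_defect, pairing_defect)
```

All three defects should be at the level of floating-point rounding; the pairing is preserved because $X^{\mathrm T} S + S X = 0$ for this block structure.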

One can now pose Dyson's question again: given Nambu space $V$ and a symmetry group $G$ acting on it, what is the set of time evolution operators that preserve the structure of $V$ and are compatible with $G$? From the section "The framework," we know the answer to be some symmetric space, but which are the symmetric spaces that occur?


Class D 


Consider a superconductor with no symmetries in its quasiparticle dynamics, so $G = \{\mathrm{Id}\}$. (A concrete example would be a disordered spin-triplet superconductor in the vortex phase.) The time evolutions $U_t = e^{\mathrm{i}tH/\hbar}$ are then constrained only by invariance of the unitary structure and the symmetric pairing $\{\,,\}$ of Nambu space. These two structures are consistent; they are related by particle-hole conjugation $C$:

$$\{ \psi, \psi' \} = \langle C\psi, \psi' \rangle$$

which is an antiunitary operator with square $C^2 = +\mathrm{Id}$. Let $V_{\mathbb R} \subset V$ denote the real vector space of fixed points of $C$. (The field operators in $V_{\mathbb R}$ are said to be of "Majorana" type in physics.) The condition $\{\psi, \psi'\} = \{U_t\psi, U_t\psi'\}$ selects a complex orthogonal group $\mathrm{SO}(V)$, and imposing unitarity yields a real orthogonal subgroup $\mathrm{SO}(V_{\mathbb R})$ with $\dim V_{\mathbb R} \in 2\mathbb{N}$, a symmetric space of the D family.

When expressed in some basis of Majorana fermions (meaning a basis of $V_{\mathbb R}$), the matrix of the time evolution generator $\mathrm{i}H \in \mathfrak{so}(V_{\mathbb R})$ is real skew, and that of $H$ imaginary skew. The simplest
random matrix model for class D, the SO-invariant 
Gaussian ensemble of imaginary skew matrices, is 
analyzed in the second edition of Mehta's (1991) 
book. From the expressions given by Mehta it is 
seen that the level correlation functions at high 
energy coincide with those of the Wigner-Dyson 
universality class of unitary symmetry. The level 
correlations at low energy, however, show different 
behavior defining a separate universality class. 
This universal behavior at low energies has immedi- 
ate physical relevance, as it is precisely the low- 
energy quasiparticles that determine the thermal 
transport properties of the superconductor at low 
temperatures. 
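The statement that $H$ becomes imaginary skew in a Majorana basis can be made concrete. The sketch below assumes the block form $\begin{pmatrix} -W^{\mathrm T} & -\bar Z \\ Z & W \end{pmatrix}$ for the action of $H$ on amplitudes $(u, f)$, and takes particle-hole conjugation to act as $(u, f) \mapsto (\bar f, \bar u)$, so that its fixed points satisfy $f = \bar u$. The particular basis, the test matrices, and the function names are illustrative assumptions.

```python
import math
import random

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def nambu_generator(w, z):
    """Block matrix [[-W^T, -conj(Z)], [Z, W]] acting on the amplitudes (u, f)."""
    n = len(w)
    top = [[-w[j][i] for j in range(n)]
           + [-complex(z[i][j]).conjugate() for j in range(n)]
           for i in range(n)]
    bottom = [list(z[i]) + list(w[i]) for i in range(n)]
    return top + bottom

def majorana_basis(n):
    """Unitary matrix whose columns are fixed by C, i.e. satisfy f = conj(u)."""
    s = 1.0 / math.sqrt(2.0)
    top = [[s if i == j else 0.0 for j in range(n)]
           + [1j * s if i == j else 0.0 for j in range(n)] for i in range(n)]
    bottom = [[s if i == j else 0.0 for j in range(n)]
              + [-1j * s if i == j else 0.0 for j in range(n)] for i in range(n)]
    return top + bottom

def random_w_and_z(n, seed=1):
    """Random Hermitian W and complex skew Z (illustrative test data)."""
    rng = random.Random(seed)
    def entry():
        return complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    a = [[entry() for _ in range(n)] for _ in range(n)]
    b = [[entry() for _ in range(n)] for _ in range(n)]
    w = [[(a[i][j] + a[j][i].conjugate()) / 2 for j in range(n)]
         for i in range(n)]
    z = [[b[i][j] - b[j][i] for j in range(n)] for i in range(n)]
    return w, z

N = 3
W, Z = random_w_and_z(N)
omega = majorana_basis(N)
H_majorana = matmul(matmul(dagger(omega), nambu_generator(W, Z)), omega)

# Imaginary: the real part vanishes. Skew: the matrix equals minus its transpose.
real_part = max(abs(x.real) for row in H_majorana for x in row)
skew_defect = max(abs(H_majorana[i][j] + H_majorana[j][i])
                  for i in range(2 * N) for j in range(2 * N))
print(real_part, skew_defect)
```

Multiplying `H_majorana` by $\mathrm{i}$ then gives a real skew matrix, the generator of a one-parameter group of real orthogonal transformations.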


Class DIII 


Let now magnetic fields and magnetic impurities be absent, so that time reversal $T$ is a symmetry of the quasiparticle system: $G = \{\mathrm{Id}, T\}$. Following the section "The framework," the set of good time evolutions is $M = K/K_\tau$ with $K = \mathrm{SO}(V_{\mathbb R})$ and $K_\tau$ the set of fixed points of $U \mapsto \tau(U) \equiv TUT^{-1}$ in $K$. What is $K_\tau$?

The time-reversal operator has square $T^2 = -\mathrm{Id}$ (for particles with spin 1/2) and commutes with particle-hole conjugation $C$, which makes $P := \mathrm{i}CT$ a useful operator to consider. Since $C$ by definition commutes with the action of $K$, and hence also with that of $K_\tau$, the subgroup $K_\tau$ has an equivalent description as

$$K_\tau = \{ k \in \mathrm{U}(V) \mid k = PkP^{-1} = \tau(k) \}$$

The operator $P$ is easily seen to have the following properties: (1) $P$ is unitary, (2) $P^2 = \mathrm{Id}$, and (3) $\mathrm{tr}_V P = 0$. Consequently, $P$ possesses two eigenspaces $V_\pm$ of equal dimension, and the condition $k = PkP^{-1}$ fixes a subgroup $\mathrm{U}(V_+) \times \mathrm{U}(V_-)$ of $\mathrm{U}(V)$. Since $P$ contains a factor $\mathrm{i} = \sqrt{-1}$ in its definition, it anticommutes with the antilinear operator $T$. Therefore, the automorphism $\tau$ exchanges $\mathrm{U}(V_+)$ with $\mathrm{U}(V_-)$, and the fixed-point set $K_\tau$ is the same as $\mathrm{U}(V_+) \simeq \mathrm{U}_{2N}$. Thus,

$$M = K/K_\tau \simeq \mathrm{SO}_{4N}/\mathrm{U}_{2N} \qquad (\dim V_+ = 2N)$$

a symmetric space in the DIII family. Note that for particles with spin 1/2 the dimension of $V_+$ has to be even.


By realizing the algebra of involutions C,T as 
Cy —(loN @1iox)w and Ty=(lw @io,)y, the 
Hamiltonians H in class DIII are brought into the 


standard form

$$H = \begin{pmatrix} 0 & Z \\ Z^\dagger & 0 \end{pmatrix}$$

where the $2N \times 2N$ matrix $Z$ is complex and skew.


Class C 


Next let the spin of the quasiparticles be conserved, as is the case for a spin-singlet superconductor with no spin-orbit scatterers present, and let time-reversal invariance be broken by a magnetic field. The symmetry group of the quasiparticle system then is the spin-rotation group: $G = G_0 = \mathrm{Spin}_3 = \mathrm{SU}_2$.

Nambu space $V$ can be arranged to be a tensor product $V = L \otimes R$ so that $G_0$ acts trivially on $L$ and by the spinor representation on the spinor space $R \simeq \mathbb{C}^2$. Since two spinors combine to give a scalar, the latter comes with an alternating bilinear form $a : R \times R \to \mathbb{C}$. In a suitable basis, the anticommutation relations [1] factor on particle-hole and spin indices. The symmetric bilinear form $\{\,,\}$ of $V$ correspondingly factors under the tensor product decomposition $V = L \otimes R$ as

$$\{ l_1 \otimes r_1,\; l_2 \otimes r_2 \} = [l_1, l_2] \times a(r_1, r_2)$$

where $[\,,]$ is an alternating form on $L$, giving $L$ the structure of a complex symplectic vector space.

The good set $M$ now consists of the time evolutions that, in addition to preserving the structure of Nambu space, commute with the spin-rotation group $\mathrm{SU}_2$:

$$M = \{ U \in \mathrm{U}(V) \mid UC = CU,\; \forall g \in \mathrm{SU}_2 : gU = Ug \}$$

By the last condition, all time evolutions act trivially on the factor $R$. The condition $UC = CU$, which expresses invariance of the symmetric form of $V$, then implies that time evolutions preserve the alternating form of $L$. Time evolutions therefore are unitary symplectic transformations of $L$, hence $M = \mathrm{USp}(L) \simeq \mathrm{USp}_{2N}$, a symmetric space of the C family. The Hamiltonian matrices in class C have the standard form

$$H = \begin{pmatrix} W & Z \\ Z^\dagger & -W^{\mathrm T} \end{pmatrix}$$

with $W$ being Hermitian and $Z$ complex and symmetric.




Class CI 


The next class is obtained by taking the time reversal $T$ as well as the spin rotations $g \in \mathrm{SU}_2$ to be symmetries of the quasiparticle system.

By arguments that should be familiar by now, the set of good time evolutions is a symmetric space $M = K/K_\tau$ with $K = \mathrm{USp}(L)$ and $K_\tau$ the set of fixed points of $\tau$ in $K$. Once again, the question to be answered is: what is $K_\tau$? The situation here is very similar to the one for class DIII, with $L$ and $\mathrm{USp}(L)$ taking the roles of $V$ and $\mathrm{SO}(V_{\mathbb R})$. By adapting the previous argument to the present case, one shows that $K_\tau$ is the same as $\mathrm{U}(L_+) \simeq \mathrm{U}_N$, where $L_+$ is the positive eigenspace of $P = \mathrm{i}CT$ viewed as a unitary operator on $L$. Thus,

$$M = K/K_\tau \simeq \mathrm{USp}_{2N}/\mathrm{U}_N$$


Dirac Fermions: The Chiral Classes 


Three large families of symmetric spaces remain to 
be implemented. Although these, too, occur in 
mesoscopic physics, their most natural realization 
is by 4D Dirac fermions in a random gauge field 
background. 

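Consider the Lagrangian $\mathcal{L}$ for the Euclidean spacetime version of QCD with $N_c \ge 3$ colors of quarks coupled to an $\mathrm{SU}_{N_c}$ gauge field $A_\mu$:

$$\mathcal{L} = \mathrm{i}\bar\psi \gamma^\mu (\partial_\mu - A_\mu)\psi + \mathrm{i} m \bar\psi \psi$$

The massless Dirac operator $D = \mathrm{i}\gamma^\mu(\partial_\mu - A_\mu)$ anticommutes with $\gamma_5 = \gamma^0\gamma^1\gamma^2\gamma^3$. Therefore, in a basis of eigenstates of $\gamma_5$ the matrix of $D$ takes the form

$$D = \begin{pmatrix} 0 & Z \\ Z^\dagger & 0 \end{pmatrix} \tag*{[3]}$$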


If the gauge field carries topological charge $\nu \in \mathbb{Z}$, the Dirac operator $D$ has at least $|\nu|$ zero modes by the index theorem. To make a simple model of the challenging situation where $A_\mu$ is distributed according to Yang-Mills measure, one takes the matrices $Z$ to be complex rectangular, of size $p \times q$ with $p - q = \nu$, and puts a Gaussian probability measure on that space. This random matrix model for $D$ captures the universal features of the QCD Dirac spectrum in the massless limit.
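The counting of zero modes in the rectangular model follows from linear algebra alone: a block matrix with off-diagonal blocks $Z$ ($p \times q$) and $Z^\dagger$ has rank at most $2\min(p, q)$, hence at least $|p - q|$ zero modes. The sketch below illustrates this under a simplifying assumption that is not the article's model: $Z$ has fixed integer entries (so $Z^\dagger = Z^{\mathrm T}$ and the rank can be computed exactly) rather than complex Gaussian ones. The helper names are hypothetical.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0),
                     None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def chiral_operator(z):
    """Block matrix [[0, Z], [Z^T, 0]] for a real p x q block Z."""
    p, q = len(z), len(z[0])
    top = [[0] * p + list(z[i]) for i in range(p)]
    bottom = [[z[i][j] for i in range(p)] + [0] * q for j in range(q)]
    return top + bottom

def anticommutes_with_gamma5(d, p):
    """Check gamma_5 D + D gamma_5 = 0 for gamma_5 = diag(+1 (p times), -1, ...)."""
    n = len(d)
    sign = [1 if i < p else -1 for i in range(n)]
    return all(sign[i] * d[i][j] + d[i][j] * sign[j] == 0
               for i in range(n) for j in range(n))

Z = [[1, 2], [3, 4], [5, 6], [7, 9]]  # p = 4, q = 2, so nu = p - q = 2
P, Q = len(Z), len(Z[0])
D = chiral_operator(Z)
zero_modes = (P + Q) - rank(D)
print(anticommutes_with_gamma5(D, P), zero_modes)
```

For this choice of $Z$ the block has full rank 2, so the number of zero modes equals $|\nu| = 2$ exactly; a rank-deficient $Z$ would give more.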

The exponential of the truncated Dirac operator, $e^{\mathrm{i}tD}$ (where $t$ is not the time), lies in a space equivalent to $\mathrm{U}_{p+q}/\mathrm{U}_p \times \mathrm{U}_q$, a symmetric space of the AIII family. We therefore say that the universal behavior of the QCD Dirac spectrum is that of symmetry class AIII.

But hold on! Why are we entitled to speak of a 
symmetry class here? By definition, symmetries 
always commute with the Hamiltonian, never do they anticommute! (The relation $D = -\gamma_5 D \gamma_5$ is not a symmetry in the sense of Dyson, nor is it a symmetry in our sense.)


Class AIII


To incorporate the massless QCD Dirac operator 
into the present classification scheme, we adapt it 
to the Nambu space setting. This is done by 
reorganizing the four-component Dirac spinor 
$\psi, \bar\psi$ as an eight-component Majorana spinor $\Psi$,
to write 


B dn ; WI" (Oy 


The $8 \times 8$ matrices $\Gamma^\mu$ are real symmetric besides satisfying the Clifford relations $\Gamma^\mu \Gamma^\nu + \Gamma^\nu \Gamma^\mu = 2\delta^{\mu\nu}$. A possible tensor product realization is


- a y) Y 


12=180,81, 
1”=0,80,81, 


1?" = 6, Oa, Oe, 


The gauge field in this Majorana representation 
is A,=18918 (A — Alay) where AT = (1/2) 
$(A_\mu \pm A_\mu^{\mathrm T})$ are the symmetric and skew parts of $A_\mu \in \mathfrak{su}(N_c)$.

The operator $H = \mathrm{i}\Gamma^\mu(\partial_\mu - \mathcal{A}_\mu)$ is imaginary skew, therefore $e^{\mathrm{i}tH}$ is real orthogonal. This means that there exists a Nambu space $V$ with unitary structure $\langle\,,\rangle$ and symmetric pairing $\{\,,\}$, both of which are preserved by the action of $e^{\mathrm{i}tH}$. No change of physical meaning or interpretation is implied by
the identical rewriting from Dirac D to Majorana H. 
The fact that Dirac fermions are not truly Majorana 
is encoded in a $\mathrm{U}_1$-symmetry $H e^{\mathrm{i}\theta Q} = e^{\mathrm{i}\theta Q} H$ generated by $Q = 1 \otimes 1 \otimes \sigma_y$.

Now comes the essential point: since $H$ obeys $H = -\bar{H}$, the chiral "symmetry" $H = -\Gamma_5 H \Gamma_5$ with
Ts=18 0,08 1 can be recast as a true symmetry: 


$$H = +\Gamma_5 \bar{H} \Gamma_5 = T H T^{-1}$$

with antilinear $T : \Psi \mapsto \Gamma_5 \bar\Psi$. Thus, the massless QCD Dirac operator is indeed associated with a symmetry class in the present, post-Dyson sense: that is class AIII, realized by self-adjoint operators on Nambu space with Dirac $\mathrm{U}_1$-symmetry and an antiunitary symmetry $T$.


Classes BDI and CII


Consider Hamiltonians $D$ still of the form [3] but now with matrix entries taken from either the real numbers or the real quaternions. Their one-parameter groups $e^{\mathrm{i}tD}$ belong to two further families of symmetric spaces, namely the classes BDI and CII of Table 1. These large families are known to be realized as symmetry classes by the massless Dirac operator with gauge group $\mathrm{SU}_2$ (for BDI), or with fermions in the adjoint representation (for CII). For the details we must refer to Verbaarschot's (1994) paper and the recent article by Heinzner et al. (2005).


See also: Classical Groups and Homogeneous Spaces; 
Compact Groups and Their Representations; 
Determinantal Random Fields; Dirac Fields in Gravitation 
and Nonabelian Gauge Theory; Dirac Operator and Dirac 
Field; High $T_c$ Superconductor Theory; Integrable
Systems in Random Matrix Theory; Lie Groups: General 
Theory; Random Matrix Theory in Physics; Random 
Partitions; Supersymmetry Methods in Random Matrix 
Theory; Symmetries and Conservation Laws. 
Introduction: Chaotic Systems Can
Synchronize 


Synchronization is a ubiquitous phenomenon characteristic of many processes in natural systems and (nonlinear) science. It has long been a subject of intensive research and is today considered one of the basic nonlinear phenomena studied in mathematics, physics, engineering, and the life sciences. The word has Greek roots, syn = common and chronos = time, and means to share a common time or to occur at the same time, that is, correlation or agreement in time of different processes (Boccaletti et al. 2002). Thus, synchronization of two dynamical systems generally means that one system somehow traces the motion of another. Indeed, it is well known that many coupled oscillators are able to adjust some common relation between them through weak interaction, which leads to a situation in which a synchronization-like phenomenon takes place.

The original work on synchronization involved 
periodic oscillators. Indeed, observations of (peri- 
odic) synchronization phenomena in physics go back 
at least as far as C Huygens (1673), who, during his 
experiments on the development of improved pen- 
dulum clocks, discovered that two very weakly 
coupled pendulum clocks become synchronized in 
phase: two clocks hanging from a common support 
(on the same beam of his room) were found to 
oscillate with exactly the same frequency and 
opposite phase due to the (weak) coupling in terms 
of the almost imperceptible oscillations of the beam 
generated by the clocks. 

Since this discovery, periodic synchronization has 
found numerous applications in various domains, 
for instance, in biological systems and living nature 
where synchronization is encountered on different 
levels. Examples range from the modeling of the 
heart to the investigation of the circadian rhythm, 
phase locking of respiration with a mechanical 
ventilator, synchronization of oscillations of human 
insulin secretion and glucose infusion, neuronal 
information processing within a brain area and 
communication between different brain areas. Also, 
synchronization plays an important role in several 
neurological diseases such as epilepsies and patho- 
logical tremors, or in different forms of cooperative 
behavior of insects, animals, or humans (Pikovsky
et al. 2001).

Synchronization is also encountered in celestial mechanics, where it explains the locking of the revolution periods of planets and satellites.

Its scope was greatly broadened by developments in radio engineering and acoustics, through the work of Eccles and Vincent (1920), who found synchronization of a triode generator. Appleton, Van der Pol, and Van der Mark (1922-27) extended this work, experimentally and theoretically, to radio tube oscillators, where they observed entrainment when driving such oscillators sinusoidally; that is, the frequency of a generator can be synchronized by a weak external signal of a slightly different frequency.

But, even though the original notion and theory of synchronization imply periodicity of the oscillators, during the last decades the notion of synchronization has been generalized to the case of interacting chaotic oscillators. Indeed, the discovery of deterministic chaos introduced new types of oscillating systems, namely the chaotic generators.

Chaotic oscillators are found in many dynamical 
systems of various origins; the behavior of such 
systems is characterized by instability and, as a 
result, limited predictability in time. 

Roughly speaking, a system is chaotic if it is 
deterministic, has a long-term aperiodic behavior, 
and exhibits sensitive dependence on initial condi- 
tions on a closed invariant set (the chaos theory is 
discussed in more detail elsewhere in this encyclo- 
pedia) (see Chaos and Attractors). 

Consequently, for a chaotic system, trajectories 
starting arbitrarily close to each other diverge 
exponentially with time, and quickly become uncor- 
related. It follows that two identical chaotic systems 
cannot synchronize. This means that they cannot 
produce identical chaotic signals, unless they are 
initialized at exactly the same point, which is in 
general physically impossible. Thus, at first sight, 
synchronization of chaotic systems seems to be 
rather surprising because one may intuitively (and 
naively) expect that the sensitive dependence on 
initial conditions would lead to an immediate 
breakdown of any synchronization of coupled 
chaotic systems. This scenario in fact led to the belief that chaos is uncontrollable and thus unusable. Despite this, in the last decades, the search for synchronization has moved to chaotic systems. Significant research has been done and, as a result, Yamada and Fujisaka (1983), Afraimovich et al. (1986), and Pecora and Carroll (1990) showed that two chaotic systems could be synchronized by coupling them: synchronization of chaos is real, and chaos can therefore be exploited. Ever since, many researchers have discussed the theory and the design or applications of synchronized motion in coupled chaotic systems. A broad variety of applications has emerged, for example, to increase the power of lasers, to synchronize the output of electronic circuits, to control oscillations in chemical reactions, or to encode electronic messages for secure communications.

The publication of the seminal paper of Pecora and Carroll (1990) had a very strong impact in the domain of chaos theory and chaos synchronization, and their applications. It stimulated very intense research activity, and the related studies continue to attract great attention. Many authors have contributed to developing this domain, theoretically or experimentally (Boccaletti et al. 2002, Pecora et al. 1997, references therein).

However, the special features of chaotic systems make it impossible to directly apply the methods developed for synchronization of periodic oscillators. Moreover, in the study of coupled chaotic systems, many different phenomena, which are usually referred to as synchronization, exist and have been studied now for over a decade. Thus, more precise descriptions of such systems are indeed desirable.

Several different regimes of synchronization have been investigated. In the following, the focus will be on explaining the essentials of this large topic, subdivided into four basic types of synchronization of coupled or forced chaotic systems which have been found and have received much attention, with emphasis on the first three:

• identical (or complete) synchronization (IS), which is defined as the coincidence of states of interacting systems;

• generalized synchronization (GS), which extends the IS phenomenon and implies the presence of some functional relation between two coupled systems; if this relationship is the identity, we recover the IS;

• phase synchronization (PS), which means entrainment of phases of chaotic oscillators, whereas their amplitudes remain uncorrelated; and

• lag synchronization (LS), which appears as a coincidence of time-shifted states of two systems.

Other regimes exist, some of which will be briefly pointed out at the end of this article; we will also briefly discuss the very relevant issue of the stability of synchronous motions.


Our discussion and the examples given here are based on unidirectionally coupled continuous systems; most of the ideas presented can easily be extended to discrete systems.

Let us also emphasize that the same year, 1990, 
saw the publication of another seminal paper, by 
Ott, Grebogi, and Yorke (OGY) on the control of 
chaos (Ott et al. 1990). Recently, it has been 
realized that synchronization and control of chaos 
share a common root in nonlinear control theory. 
Both topics were presented by many authors in a 
unified framework. However, synchronization of 
chaos has evolved in its own right, even if it is 
nowadays known as a part of the nonlinear control 
theory. 


Synchronization and Stability 


For the basic master-slave configuration, where an autonomous chaotic system (the master)

$$\frac{dX}{dt} = F(X), \qquad X \in \mathbb{R}^n \tag*{[1]}$$

drives another system (the slave),

$$\frac{dY}{dt} = G(X, Y), \qquad Y \in \mathbb{R}^m \tag*{[2]}$$

synchronization takes place when $Y$ asymptotically copies, in a certain manner, a subset $X_p$ of $X$. That is, there exists a relation between the two coupled systems, which could be a smooth invertible function $\psi$, which transforms the trajectories on the attractor of the first system into those on the attractor of the second. In other words, if we know, after a transient regime, the state of the first system, we can predict the state of the second: $Y(t) \approx \psi(X(t))$. Generally, it is assumed that $n \ge m$; however, for ease of reading (even if this is not a necessary restriction) only the case $n = m$ will be considered; thus, $X_p = X$. Henceforth, if we denote the difference $Y - \psi(X)$ by $X_\perp$, in order to arrive at a synchronized motion, it is expected that

$$\|X_\perp\| \to 0, \quad \text{as } t \to +\infty \tag*{[3]}$$

If $\psi$ is the identity function, the process is called IS.


Definition of IS System [2] synchronizes with system [1], if the set $M = \{(X, Y) \in \mathbb{R}^n \times \mathbb{R}^n,\; Y = X\}$ is an attracting set with a basin of attraction $B$ ($M \subset B$) such that $\lim_{t \to \infty} \|X(t) - Y(t)\| = 0$, for all $(X(0), Y(0)) \in B$.


Thus, this regime corresponds to the situation 
where all the variables of two (or more) coupled 
chaotic systems converge. 


If $\psi$ is not the identity function, the phenomenon is more general and is referred to as GS.


Definition of GS System [2] synchronizes with system [1], in the generalized sense, if there exists a transformation $\psi : \mathbb{R}^n \to \mathbb{R}^m$, a manifold $M = \{(X, Y) \in \mathbb{R}^{n+m},\; Y = \psi(X)\}$ and a subset $B$ ($M \subset B$), such that for all $(X_0, Y_0) \in B$, the trajectory based on the initial conditions $(X_0, Y_0)$ approaches $M$ as time goes to infinity. This is explained further in the following.


Henceforth, in the case of IS, eqn [3] above means that a certain hyperplane $M$, called the synchronization manifold, within $\mathbb{R}^{2n}$, is asymptotically stable. Consequently, to establish synchronous motion, we have to prove that the origin of the transverse system $X_\perp = Y - X$ is asymptotically stable, that is, that the motion transversal to the synchronization manifold dies out.

Significant progress has been made by
mathematicians and physicists in studying the 
stability of synchronous motions. Two main tools 
are used in the literature for this aim: conditional 
Lyapunov exponents and asymptotic stability. In the 
examples given below, we will essentially formulate 
conditions for synchronization in terms of Lyapunov 
exponents, which play a central role in chaos theory. 
These quantities measure the sensitive dependence 
on initial conditions for a dynamical system and also 
quantify synchronization of chaos. 

The Lyapunov exponents associated with the variational equation corresponding to the transverse system $X_\perp$:

$$\frac{dX_\perp}{dt} = DF(X)\, X_\perp \tag*{[4]}$$

where $DF(X)$ is the Jacobian of the vector field evaluated along the driving trajectory $X$, are referred to as transverse or conditional Lyapunov exponents (CLEs).

In the case of IS, it appears that the condition $L^\perp_{\max} < 0$ is sufficient to ensure synchronization, where $L^\perp_{\max}$ is the largest CLE. Indeed, eqn [4] gives the dynamics of the motion transverse to the synchronization manifold; therefore, CLEs indicate if this motion dies out or not, and hence, whether the synchronization state is stable or not. Consequently, if $L^\perp_{\max}$ is negative, it ensures the stability of the synchronized state. This will be best explained using two examples below.

Even if there exist other approaches for studying synchronization, one may ask if this condition on $L^\perp_{\max}$ is true in general. To answer this question, mathematicians have recently formulated it in terms of properties of manifolds (or synchronization hyperplanes). Some rigorous results on (generalized) synchronization, when the system is smooth, are given by Josic (2000). This approach relies on the Fenichel theory of normally hyperbolic invariant manifolds and quantities that resemble Lyapunov exponents, and is referred to as differentiable GS. However, many situations correspond to the case where, in some region of values of the coupling parameters, the function $\psi$ is only continuous but not smooth, that is, the graph of $\psi$ is a complicated geometrical object. This kind of synchronization is called nonsmooth GS (Afraimovich et al. 2001). Furthermore, the mathematical theory of IS often assumes the coupled oscillators to be identical, even if, in practice, no two oscillators are exact copies of each other. This leads to small differences in system parameters and then to synchronization errors. These errors have been studied by many authors (see, e.g., Illing (2002), and references therein).


Identical Synchronization 


Perhaps the best way to explain synchronization of 
chaos is through IS, also referred to as conventional 
or complete synchronization (Boccaletti et al. 2002). 
It is the simplest form of chaos synchronization and 
generalizes to the complete replacement which is 
explained below. It is also the most typical form of 
chaotic synchronization often observable in two 
identical systems. 

There are various processes leading to synchronization; depending on the particular coupling configuration used these processes could be very different. So, one has to distinguish between the following two main situations, even if they are, in some sense, similar: the unidirectional and the bidirectional coupling. Indeed, synchronization of chaotic systems is often studied for schemes of the form

$$\frac{dX}{dt} = F(X) + kN(X - Y), \qquad \frac{dY}{dt} = G(Y) + kM(X - Y) \tag*{[5]}$$

where $F$ and $G$ act in $\mathbb{R}^n$, $(X, Y) \in (\mathbb{R}^n)^2$, $k$ is a scalar, and $M$ and $N$ are coupling matrices belonging to $\mathbb{R}^{n \times n}$. If $F = G$ the two subsystems $X$ and $Y$ are identical. Moreover, when both matrices are nonzero then the coupling is called bidirectional, while it is referred to as unidirectional if one is the zero matrix, and the other nonzero.
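Two structural features of scheme [5] can be read off directly from its right-hand side: for $F = G$ the coupling terms vanish on the set $X = Y$, so a synchronized state stays synchronized, and for $N = 0$ the evolution of $X$ does not depend on $Y$. The sketch below checks both at sample points. As illustrative assumptions, it uses the vector field of system [9] of this article for $F = G$, a coupling matrix that acts on the first component only, and an arbitrary value of $k$; the function names are hypothetical.

```python
def vector_field(state):
    """Right-hand side of system [9], used here for both F and G."""
    x, y, z = state
    return (-9.0 * x - 9.0 * y, -17.0 * x - y - x * z, -z + x * y)

def matvec(m, v):
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

def coupled_rhs(x_state, y_state, k, n_mat, m_mat):
    """Scheme [5]: dX/dt = F(X) + k N (X - Y), dY/dt = F(Y) + k M (X - Y)."""
    diff = tuple(a - b for a, b in zip(x_state, y_state))
    coupling_x = matvec(n_mat, diff)
    coupling_y = matvec(m_mat, diff)
    dx = tuple(f + k * c for f, c in zip(vector_field(x_state), coupling_x))
    dy = tuple(f + k * c for f, c in zip(vector_field(y_state), coupling_y))
    return dx, dy

ZERO = [[0.0] * 3 for _ in range(3)]
M_FIRST_COMPONENT = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

# On the synchronization manifold X = Y the coupling vanishes.
point = (1.0, -2.0, -3.0)
dx, dy = coupled_rhs(point, point, 5.0, ZERO, M_FIRST_COMPONENT)
manifold_is_invariant = (dx == dy == vector_field(point))

# With N = 0 the drive X evolves as if uncoupled, whatever Y is.
other = (0.5, 0.5, -1.0)
dx_uni, dy_uni = coupled_rhs(point, other, 5.0, ZERO, M_FIRST_COMPONENT)
drive_is_unaffected = (dx_uni == vector_field(point))
response_is_affected = (dy_uni != vector_field(other))

print(manifold_is_invariant, drive_is_unaffected, response_is_affected)
```

Invariance of the manifold says nothing about its stability; whether nearby trajectories actually converge to it depends on $k$ and on the coupling matrices.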


Constructing Pairs of Synchronized Systems: 
Complete Replacement 


Pecora and Carroll (1990) proposed the use of stable subsystems of given chaotic systems to construct pairs of unidirectionally coupled synchronizing systems. Since then generalizations of this approach have been developed and various methods now exist to synchronize systems (Wu 2002, Hasler 1998).

One way to build a pair of synchronized systems is then to use the basic construction method introduced by Pecora and Carroll, who made an important observation. They found that, when they make a replica of part of a chaotic system and send a system variable from the original system (transmitter) to drive this replica (receiver), sometimes the replica subsystem and the original chaotic one lock in step and evolve together chaotically in synchrony. This method can be described as follows. Consider the autonomous $n$-dimensional dynamical system,

$$\frac{du}{dt} = F(u) \tag*{[6]}$$

divide this system into two subsystems ($u = (v, w)$),

$$\frac{dv}{dt} = G(v, w), \qquad \frac{dw}{dt} = H(v, w) \tag*{[7]}$$

where $v = (u_1, \ldots, u_m)$, $w = (u_{m+1}, \ldots, u_n)$, $G = (F_1, \ldots, F_m)$, and $H = (F_{m+1}, \ldots, F_n)$. Next, create a new subsystem $w'$ identical to the $w$-subsystem. This yields a $(2n - m)$-dimensional system:

$$\frac{dv}{dt} = G(v, w), \qquad \frac{dw}{dt} = H(v, w), \qquad \frac{dw'}{dt} = H(v, w') \tag*{[8]}$$

The first state-variable component $v(t)$ of the $(v, w)$ system is then used as the input to the $w'$-system. The coupling is unidirectional and the $(v, w)$ subsystem is referred to as the driving (or master) system, the $w'$-subsystem as the response (or slave) system. In this context, the following notions and results are useful.


Definition If $\lim_{t \to +\infty} \|w'(t) - w(t)\| = 0$ and $w'(t)$ continues to remain in step with $w(t)$ in the course of time, the two subsystems are said to be synchronized.


Definition The Lyapunov exponents of the response subsystem ($w'$) for a particular driving trajectory $v(t)$ are called CLEs.


Let $w(t)$ be a chaotic trajectory with initial condition $w(0)$, and $w'(t)$ be a trajectory started at a nearby point $w'(0)$. The basic idea of the Pecora-Carroll approach is to establish the asymptotic stability of the solutions of the $w'$-subsystem by means of CLEs. They have shown the following result (Pecora and Carroll 1990):


Theorem A necessary and sufficient condition for the two subsystems, $w$ and $w'$, to be synchronized is that all of the CLEs be negative.


Note that only a finite number of possible decompositions (or couplings) $v$-$w$ exist; this is bounded by the number of different possible subsystems, namely $N(N - 1)/2$. (For a description and mathematical analysis of various coupling schemes see Wu (2002).) Furthermore, if the main system [6] is split in a different way, (complete) synchronization may fail to occur. Indeed, in general, only a few of the possible response subsystems possess negative CLEs, and may thus be used to implement synchronizing systems using the Pecora-Carroll method. In fact, it has been pointed out in the literature that in some cases, the CLE criterion is not as practical as some other criteria.

For simplicity, the idea will now be developed on the following three-dimensional simple autonomous system, which belongs to the class of dynamical systems called generalized Lorenz systems (see Deriviere and Aziz-Alaoui (2003), and references therein):

$$\begin{aligned} \dot{x} &= -9x - 9y \\ \dot{y} &= -17x - y - xz \\ \dot{z} &= -z + xy \end{aligned} \tag*{[9]}$$

(This should be compared with the well-known Lorenz system:

$$\begin{aligned} \dot{x} &= -10x + 10y \\ \dot{y} &= 28x - y - xz \\ \dot{z} &= -\tfrac{8}{3} z + xy \end{aligned}$$

which differs in the signs of various terms and the values of coefficients.) Previous studies have shown that system [9] oscillates chaotically; its Lyapunov exponents are +0.601, 0.000, and −16.470; it exhibits the chaotic attractor of Figure 1, with a three-dimensional feature very similar to that of the Lorenz attractor (in fact, it satisfies the condition $z < 0$, but in our context it does not matter).

Figure 1 The chaotic attractor of system [9]: x-y and x-z plane projections.

Let us divide system [9] into two subsystems $v = x_1$ and $w = (y_1, z_1)$. By creating a copy $w' = (y_2, z_2)$ of the $w$-subsystem, we obtain the following five-dimensional dynamical system:

$$\begin{aligned} \dot{x}_1 &= -9x_1 - 9y_1 \\ \dot{y}_1 &= -17x_1 - y_1 - x_1 z_1 \\ \dot{z}_1 &= -z_1 + x_1 y_1 \\ \dot{y}_2 &= -17x_1 - y_2 - x_1 z_2 \\ \dot{z}_2 &= -z_2 + x_1 y_2 \end{aligned} \tag*{[10]}$$

In numerical experiments, it was observed that the motion quickly results in the two equalities $\lim_{t \to +\infty} |y_2 - y_1| = 0$ and $\lim_{t \to +\infty} |z_2 - z_1| = 0$ being satisfied, that is, $\lim_{t \to +\infty} \|w' - w\| = 0$. These equalities persist as the system evolves. Hence, the two subsystems $w$ and $w'$ are synchronized. Figure 2 illustrates this phenomenon.
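A numerical experiment of this kind is easy to reproduce. The following sketch integrates system [10] with a classical fourth-order Runge-Kutta scheme and compares the distance between $w' = (y_2, z_2)$ and $w = (y_1, z_1)$ at the start and at the end of the run. The step size, the integration time, and the initial condition are arbitrary illustrative choices, not values taken from the article.

```python
import math

def drive_response(s):
    """Right-hand side of system [10] for s = (x1, y1, z1, y2, z2)."""
    x1, y1, z1, y2, z2 = s
    return (
        -9.0 * x1 - 9.0 * y1,
        -17.0 * x1 - y1 - x1 * z1,
        -z1 + x1 * y1,
        -17.0 * x1 - y2 - x1 * z2,   # response copy, driven by x1
        -z2 + x1 * y2,
    )

def rk4_step(f, s, h):
    """One classical Runge-Kutta step of size h."""
    k1 = f(s)
    k2 = f(tuple(a + 0.5 * h * b for a, b in zip(s, k1)))
    k3 = f(tuple(a + 0.5 * h * b for a, b in zip(s, k2)))
    k4 = f(tuple(a + h * b for a, b in zip(s, k3)))
    return tuple(a + h / 6.0 * (p + 2.0 * q + 2.0 * r + w)
                 for a, p, q, r, w in zip(s, k1, k2, k3, k4))

def sync_error(s):
    """Euclidean distance between w' = (y2, z2) and w = (y1, z1)."""
    return math.hypot(s[3] - s[1], s[4] - s[2])

def simulate(s, h=0.001, t_end=25.0):
    for _ in range(int(round(t_end / h))):
        s = rk4_step(drive_response, s, h)
    return s

initial = (1.0, 1.0, -1.0, 6.0, -4.0)  # response starts far from the drive
final = simulate(initial)
print(sync_error(initial), sync_error(final))
```

The final distance should be many orders of magnitude smaller than the initial one, while the drive variables themselves keep oscillating irregularly.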

It is also easy to verify that the synchronization 
persists even if a slight change in the parameters of 
the system is made. The CLEs of the linearization of 
the system around the synchronous state, the 
negativity of which determines the stability of the 
synchronized solution, are also computed easily. 
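For system [10] the computation can be sketched as follows. Subtracting the $(y_1, z_1)$ equations from the $(y_2, z_2)$ equations gives the linear error system $\dot{e}_y = -e_y - x_1 e_z$, $\dot{e}_z = -e_z + x_1 e_y$, for which a short calculation (ours, not taken from the article) gives $\frac{d}{dt}(e_y^2 + e_z^2) = -2(e_y^2 + e_z^2)$, so both CLEs should equal $-1$. The code estimates the largest CLE numerically by the usual renormalization procedure; the step size, run length, and initial data are illustrative assumptions.

```python
import math

def drive_and_tangent(s):
    """System [9] for (x, y, z) together with the linearized response error (ey, ez)."""
    x, y, z, ey, ez = s
    return (
        -9.0 * x - 9.0 * y,
        -17.0 * x - y - x * z,
        -z + x * y,
        -ey - x * ez,
        -ez + x * ey,
    )

def rk4_step(f, s, h):
    """One classical Runge-Kutta step of size h."""
    k1 = f(s)
    k2 = f(tuple(a + 0.5 * h * b for a, b in zip(s, k1)))
    k3 = f(tuple(a + 0.5 * h * b for a, b in zip(s, k2)))
    k4 = f(tuple(a + h * b for a, b in zip(s, k3)))
    return tuple(a + h / 6.0 * (p + 2.0 * q + 2.0 * r + w)
                 for a, p, q, r, w in zip(s, k1, k2, k3, k4))

def largest_cle(h=0.001, t_end=20.0):
    """Average exponential growth rate of the error vector along the drive."""
    s = (1.0, 1.0, -1.0, 1.0, 0.0)
    steps = int(round(t_end / h))
    log_growth = 0.0
    for _ in range(steps):
        s = rk4_step(drive_and_tangent, s, h)
        norm = math.hypot(s[3], s[4])
        log_growth += math.log(norm)
        # Renormalize the error vector so that it never underflows.
        s = s[:3] + (s[3] / norm, s[4] / norm)
    return log_growth / (steps * h)

cle = largest_cle()
print(cle)
```

For other decompositions of system [9] no such closed-form argument need be available, and the numerical estimate then requires longer runs and a discarded transient.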

Pecora-Carroll similarly built the system [10] by using the following steps. Starting with two copies of system [9], a signal $x(t)$ is transmitted from the first to the second: in the second system all $x$-components are replaced with the signal from the first system, that is, $x_2$ is replaced by $x_1$ in the second system. Finally, the $dx_2/dt$ equation is eliminated, since it is exactly the same as the $dx_1/dt$ equation, and is superfluous. This then results in system [10]. For this reason, Pecora-Carroll called this construction a complete replacement. Thus, it is natural to think of the $x_1$ variable as driving the second system, and to label the first system the drive and the second system the response. In fact, this method is a particular case of the unidirectional coupling method explained below. Note also that this method could be modified by using a partial substitution approach, in which a response variable is replaced with the drive counterpart only in certain locations (Pecora et al. 1997).


Unidirectional IS 


IS with unidirectional coupling has also been called 
one-way diffusive coupling, drive-response coupling, 
master-slave coupling, or negative feedback control. 

System [5], with $F = G$ and $N = 0$, becomes 
unidirectionally coupled, and reads 

\[
\begin{aligned}
\frac{dX}{dt} &= F(X) \\
\frac{dY}{dt} &= F(Y) + kM(X - Y)
\end{aligned}
\qquad [11]
\]


M is then a matrix that determines the linear 
combination of X components that will be used 
in the difference, and k determines the strength of 
the coupling (see, for an interesting review on 
this subject, Pecora et al. (1997)). In unidirectional 
synchronization, the evolution of the first system 
(the drive) is unaltered by the coupling, the second 
system (the response) is then constrained to copy the 
dynamics of the first. Let us consider an example 
with two copies of system [9], and for 

\[
M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\qquad [12]
\]

that is, by adding a damping term to the first equation 
of the response system, we get the following 
unidirectionally coupled system, coupled through a linear 
term $k > 0$ according to variables $x_{1,2}$: 


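\[
\begin{aligned}
\dot{x}_1 &= -9x_1 - 9y_1 \\
\dot{y}_1 &= -17x_1 - y_1 - x_1 z_1 \\
\dot{z}_1 &= -z_1 + x_1 y_1 \\
\dot{x}_2 &= -9x_2 - 9y_2 - k(x_2 - x_1) \\
\dot{y}_2 &= -17x_2 - y_2 - x_2 z_2 \\
\dot{z}_2 &= -z_2 + x_2 y_2
\end{aligned}
\qquad [13]
\]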
Figure 2 Complete replacement synchronization. Time series for (a) $y_i(t)$ and (b) $z_i(t)$, $i = 1, 2$, in system [10]. The difference 
between the variable of the transmitter and the variable of the receiver tends asymptotically to zero as time progresses, that is, 
synchronization occurs after transients die down. (c) The plot of amplitudes $y_1$ against $y_2$, after transients die down, shows a diagonal 
line, which also indicates that the receiver and the transmitter are maintaining synchronization. The plot of $z_1$ against $z_2$ shows a 
similar behavior. 


For $k = 0$, the two subsystems are uncoupled; for 
$k > 0$ both subsystems are unidirectionally coupled; 
and for $k \to +\infty$, we recover the complete replacement 
coupling scheme explained above. Our numerical 
computations yield the threshold value $\tilde{k}$ for 
synchronization; we found that for $k \geq \tilde{k} = 4.999$, 
both subsystems of [13] synchronize. That is, 
starting from random initial conditions, and after 
some transient time, system [13] generates the same 
attractor as system [9] (see Figure 1). Consequently, 
all the variables of the coupled chaotic 
subsystems converge: $x_2$ converges to $x_1$, $y_2$ to $y_1$, 
and $z_2$ to $z_1$ (see Figure 3). Thus, the second system 
(the response) is locked to the first one (the drive). 
Alternatively, observation of diagonal lines in 
correlation diagrams, which plot the amplitudes $x_1$ 
against $x_2$, $y_1$ against $y_2$, and $z_1$ against $z_2$, can also 
indicate the occurrence of system synchronization. 
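
A rough way to explore the dependence of system [13] on the coupling strength is sketched below. It is an illustration under stated assumptions rather than the computation behind the quoted threshold: the Runge-Kutta integrator, the step size, the default initial condition, and the names `coupled_rhs`, `rk4_step`, and `sync_error` are all hypothetical choices.

```python
def coupled_rhs(s, k):
    """Right-hand side of system [13]; s = (x1, y1, z1, x2, y2, z2)."""
    x1, y1, z1, x2, y2, z2 = s
    return (
        -9.0 * x1 - 9.0 * y1,                    # drive, unaffected by the coupling
        -17.0 * x1 - y1 - x1 * z1,
        -z1 + x1 * y1,
        -9.0 * x2 - 9.0 * y2 - k * (x2 - x1),    # response, coupled through x only
        -17.0 * x2 - y2 - x2 * z2,
        -z2 + x2 * y2,
    )


def rk4_step(s, k, dt):
    """One classical Runge-Kutta step (illustrative choice of integrator)."""
    a = coupled_rhs(s, k)
    b = coupled_rhs(tuple(p + 0.5 * dt * q for p, q in zip(s, a)), k)
    c = coupled_rhs(tuple(p + 0.5 * dt * q for p, q in zip(s, b)), k)
    d = coupled_rhs(tuple(p + dt * q for p, q in zip(s, c)), k)
    return tuple(
        p + dt * (qa + 2.0 * qb + 2.0 * qc + qd) / 6.0
        for p, qa, qb, qc, qd in zip(s, a, b, c, d)
    )


def sync_error(k, t_end, dt=0.001, start=(1.0, 2.0, -3.0, -2.0, 1.0, -1.0)):
    """Largest componentwise distance between response and drive at time t_end."""
    s = start
    for _ in range(int(round(t_end / dt))):
        s = rk4_step(s, k, dt)
    return max(abs(s[3] - s[0]), abs(s[4] - s[1]), abs(s[5] - s[2]))
```

Evaluating `sync_error(k, t_end)` for a range of `k` on either side of the threshold quoted above, with `t_end` long enough for transients to die down, is one way to check that value independently. The step size should be reduced if very large values of `k` are used, since the coupling term makes the equations stiff.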

IS was the first type of synchronization for which examples 
of unidirectionally coupled chaotic systems were presented. It 
is important for potential applications of chaos 
synchronization in communication systems, or for 
time-series analysis, where the information flow is 
also unidirectional. 


Bidirectional IS 


Figure 3 Time series for $x_i(t)$, $y_i(t)$, and $z_i(t)$ ($i = 1, 2$) in system [13] for the coupling constant $k = 5.0$, that is, beyond the threshold 
necessary for synchronization. After transients die down, the two subsystems synchronize perfectly. 

A second brief example uses a bidirectional (also 
called mutual or two-way) coupling. In this situation, 
in contrast to the unidirectional coupling, both 
drive and response systems are connected in such a 
way that they influence each other’s behavior. Many 
biological or physical systems consist of bidirectionally 
interacting elements or components; examples 
range from cardiac and respiratory systems to 
coupled lasers with feedback. Let us then take two 
copies of the same system [9] as given above, but 
two-way coupled through a linear constant term $k > 0$ 
according to variables $x_{1,2}$: 


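\[
\begin{aligned}
\dot{x}_1 &= -9x_1 - 9y_1 - k(x_1 - x_2) \\
\dot{y}_1 &= -17x_1 - y_1 - x_1 z_1 \\
\dot{z}_1 &= -z_1 + x_1 y_1 \\
\dot{x}_2 &= -9x_2 - 9y_2 - k(x_2 - x_1) \\
\dot{y}_2 &= -17x_2 - y_2 - x_2 z_2 \\
\dot{z}_2 &= -z_2 + x_2 y_2
\end{aligned}
\qquad [14]
\]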


We can get an idea of the onset of synchronization 
by plotting, for example, $x_1$ against $x_2$ for various 
values of the coupling-strength parameter $k$. Our 
numerical computations yield the threshold value $\tilde{k}$ 
for synchronization, $\tilde{k} \simeq 2.50$ (Figure 4): beyond it, both 
$(x_i, y_i, z_i)$ subsystems synchronize and system [14] 
also generates the attractor of Figure 1. 


Synchronization manifold and stability  
Geometrically, the fact that systems [13] and [14], 
beyond synchronization, generate the same attractor 
as system [9], implies that the attractors of these 
combined drive-response six-dimensional systems 
are confined to a three-dimensional hyperplane (the 
synchronization manifold) defined by $Y = X$. After 
the synchronization is reached, this manifold is a 
stable submanifold in the full phase space $\mathbb{R}^6$. 
Figure 5 gives an idea of what the geometry of the 
synchronous attractor of system [13] or [14] looks 
like, by exhibiting the projection of the phase space 
$\mathbb{R}^6$ onto the $(x_1, y_1, y_2)$ subspace. One can 
similarly plot any combination of the variables $x_i$, $y_i$, and 
$z_i$ ($i = 1, 2$), and get the same result, since the 
motion, in the case of synchronization, is confined to 
the hyperplane defined in $\mathbb{R}^6$ by the equalities 
$x_1 = x_2$, $y_1 = y_2$, and $z_1 = z_2$. 

Figure 4 Illustration of the onset of synchronization of system [14]. (a)-(c) Plots of amplitudes $x_1$ against $x_2$ for values of the coupling 
parameter $k = 0.5, 1.5, 2.8$, respectively. The system synchronizes for $k > 2.5$. (d) Plot, for $k = 2.8$, of the norm $N(X) = \|x_1 - x_2\| + 
\|y_1 - y_2\| + \|z_1 - z_2\|$ versus $t$, which shows that the system synchronizes very quickly. 

Figure 5 The motion of synchronized system [13] or [14] takes 
place on a chaotic attractor which is embedded in the 
synchronization manifold, that is, the hyperplane defined by 
$x_1 = x_2$, $y_1 = y_2$, and $z_1 = z_2$. 

Figure 6 The largest transverse Lyapunov exponents $L^{\perp}_{\max}$ as 
a function of coupling strength $k$, in the unidirectional system [13] 
(solid) and the bidirectional system [14] (dotted). 

This hyperplane is stable since small perturbations 
which take the trajectory off the synchronization 
manifold decay in time. Indeed, as stated earlier, the 
CLEs of the linearization of the system around the 
synchronous state determine the stability of 
the synchronized solution. This leads to requiring 
that the origin of the transverse system, $X_{\perp}$, be 
asymptotically stable. To see this, for both systems 
[13] and [14], we switch to the new set of 
coordinates $X_{\perp} = Y - X$, that is, $x_{\perp} = x_2 - x_1$, 
$y_{\perp} = y_2 - y_1$, and $z_{\perp} = z_2 - z_1$. The origin $(0, 0, 0)$ 
is obviously a fixed point for this transverse system, 
within the synchronization manifold. Therefore, for 
small deviations from the synchronization manifold, 
this system reduces to a typical variational equation: 

\[
\frac{dX_{\perp}}{dt} = DF(X)\, X_{\perp}
\qquad [15]
\]

where $DF(X)$ is the Jacobian of the vector field 
evaluated along the driving trajectory $X$, that is, 

\[
\begin{pmatrix} dx_{\perp}/dt \\ dy_{\perp}/dt \\ dz_{\perp}/dt \end{pmatrix}
= V_k \begin{pmatrix} x_{\perp} \\ y_{\perp} \\ z_{\perp} \end{pmatrix}
\qquad [16]
\]

For systems [13] and [14], we obtain 

\[
V_k = \begin{pmatrix} -9 - k_1 & -9 & 0 \\ -17 - z & -1 & -x \\ y & x & -1 \end{pmatrix}
\qquad [17]
\]

with $k_1 = k$ for system [13] and $k_1 = 2k$ for system 
[14]. Let us remark that the only difference between 
both matrices $V_k$ is the coupling $k$, which has a factor 
2 in the bidirectional case. Figure 6 shows the 
dependence of $L^{\perp}_{\max}$ on $k$, for both examples of 
unidirectionally and bidirectionally coupled systems. 
$L^{\perp}_{\max}$ becomes negative as $k$ increases, which 
ensures the stability of the synchronized state for 
systems [13] and [14]. 
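
The dependence of the largest transverse Lyapunov exponent on the coupling can be estimated numerically from the variational equation [15] with the matrix of [17]. The sketch below is one possible way to do this, not the procedure used for Figure 6: it integrates the drive [9] together with one tangent vector, renormalizes the vector at every step, and averages the logarithmic growth. The integrator, step size, run lengths, initial condition, and function names are assumptions, and the result is expressed with natural logarithms, so its scale need not match the values quoted in the article.

```python
import math


def drive_and_tangent_rhs(s, k1):
    """Drive system [9] plus the variational equation [15]; s = (x, y, z, u, v, w)."""
    x, y, z, u, v, w = s
    return (
        -9.0 * x - 9.0 * y,
        -17.0 * x - y - x * z,
        -z + x * y,
        (-9.0 - k1) * u - 9.0 * v,       # first row of the matrix in [17]
        (-17.0 - z) * u - v - x * w,     # second row
        y * u + x * v - w,               # third row
    )


def rk4_step(s, k1, dt):
    """One classical Runge-Kutta step (illustrative choice of integrator)."""
    f = drive_and_tangent_rhs
    a = f(s, k1)
    b = f(tuple(p + 0.5 * dt * q for p, q in zip(s, a)), k1)
    c = f(tuple(p + 0.5 * dt * q for p, q in zip(s, b)), k1)
    d = f(tuple(p + dt * q for p, q in zip(s, c)), k1)
    return tuple(
        p + dt * (qa + 2.0 * qb + 2.0 * qc + qd) / 6.0
        for p, qa, qb, qc, qd in zip(s, a, b, c, d)
    )


def largest_transverse_exponent(k1, t_end=100.0, t_transient=20.0, dt=0.002):
    """Finite-time estimate of the largest transverse exponent (natural-log units).

    k1 = k for the unidirectional system [13], k1 = 2k for the bidirectional [14].
    """
    s = (1.0, 2.0, -3.0, 1.0, 0.0, 0.0)
    n_transient = int(round(t_transient / dt))
    n_total = n_transient + int(round(t_end / dt))
    log_growth = 0.0
    for step in range(n_total):
        s = rk4_step(s, k1, dt)
        norm = math.sqrt(s[3] ** 2 + s[4] ** 2 + s[5] ** 2)
        s = s[:3] + (s[3] / norm, s[4] / norm, s[5] / norm)  # keep tangent at unit length
        if step >= n_transient:
            log_growth += math.log(norm)
    return log_growth / t_end
```

Because the bidirectional case enters only through $k_1 = 2k$, a single scan over `k1` covers both curves of Figure 6; the sign change of the estimate locates the synchronization threshold up to finite-time fluctuations.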

Let us note that this can also be proved 
analytically, as done by Derivière and Aziz-Alaoui 
(2003), by using a suitable Lyapunov function and 
a new extended version of the LaSalle invariance 
principle. 


Desynchronization motion  Synchronization depends 
not only on the coupling strength, but also on the 
vector field and the coupling function. For some 
choices of these quantities, synchronization may 
occur only within a finite range $[k_1, k_2]$ of coupling 
strength; in such a case a desynchronization 
phenomenon occurs. Thus, increasing $k$ beyond the 
critical value $k_2$ yields loss of the synchronized 
motion ($L^{\perp}_{\max}$ becomes positive). 


Generalized Synchronization 


Identical chaotic systems synchronize by following the 
same chaotic trajectory. However, real systems are in 
general not identical. For instance, when the parameters 
of two coupled identical systems do not match, 
or when these coupled systems belong to different 
classes, complete IS may not be expected, because 
there does not exist an invariant manifold $Y = X$, 
as for IS. For non-identical systems, the possibility of 
some type of synchronization has been investigated 
(Afraimovich et al. 1986). It was shown that when two 
different systems are coupled with sufficiently strong 
coupling strength, a general synchronous relation 
between their states could exist and it could be 
expressed by a smooth invertible function, 
$Y(t) = \psi(X(t))$. This phenomenon, called GS, is thus a 
relaxed and extended form of IS in non-identical 
systems. 

However, it may also occur for pairs of identical 
systems, for example, for systems having reflection 
symmetry, $F(-X) = -F(X)$. Besides these examples 
of GS, others also exist that exploit symmetries of 
the underlying systems (Parlitz and Kocarev 1999). 

GS was introduced for unidirectionally coupled 
systems by Rulkov et al. (1995). For simplicity, we 
also focus on unidirectionally coupled continuous 
time systems: 

\[
\begin{aligned}
\frac{dX}{dt} &= F(X) \\
\frac{dY}{dt} &= G(Y, u(t))
\end{aligned}
\qquad [18]
\]

where $X \in \mathbb{R}^n$, $Y \in \mathbb{R}^m$, $F: \mathbb{R}^n \to \mathbb{R}^n$, 
$G: \mathbb{R}^m \times \mathbb{R}^k \to \mathbb{R}^m$, and $u(t) = (u_1(t), \ldots, u_k(t))$ with 
$u_i(t) = h_i(X(t, X_0))$. Two (non-identical) dynamical 
systems are said to be synchronized in a generalized 
sense if there is a continuous function $\psi$ from the 
phase space of the first to the phase space of the 
second, taking orbits of the first system to orbits of 
the second. 

The main problem is to know when and under 
what conditions system [18] undergoes GS. Many 
authors have addressed this question, and it has been 
shown that asymptotic stability is equally significant 
for this more universal concept (for some theoretical 
results, see Rulkov et al. (1995) and Parlitz and 
Kocarev (1999)). For unidirectionally coupled con- 
tinuous time systems, the following results hold: 


Theorem  A necessary and sufficient condition for 
system [18] to be synchronized in the generalized 
sense is that for each $u(t) = u(X(t, X_0))$ the system 
is asymptotically stable. 


When it is not possible to find a Lyapunov function 
in order to use this theorem, one can numerically 
compute the CLEs of the response system, and use the 
following result: 


Theorem The drive and response subsystems of 
system [18] synchronize in the generalized sense iff 
all of the CLEs of the response subsystem are 
negative. 


The definition of $\psi$ has the advantage that it allows 
the discussion of synchronization of non-identical 
systems and, at the same time, the consideration of 
synchronization in terms of the properties of the synchronization 
manifold. Therefore, it is important to study the 
existence of the transformation $\psi$ and its nature 
(continuity, smoothness, ...). Unfortunately, except in 
special cases (Afraimovich et al. 1986), rarely will one 
be able to produce formulas exhibiting the mapping $\psi$. 

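An example of two unidirectionally coupled 
chaotic systems which synchronize in the generalized 
sense is given below. Consider the following Rössler 
system driven by system [9]: 

\[
\begin{aligned}
\dot{x}_1 &= -9x_1 - 9y_1 \\
\dot{y}_1 &= -17x_1 - y_1 - x_1 z_1 \\
\dot{z}_1 &= -z_1 + x_1 y_1 \\
\dot{x}_2 &= -y_2 - z_2 - k\left(x_2 - (x_1^2 + y_1^2)\right) \\
\dot{y}_2 &= x_2 + 0.2\,y_2 - k\left(y_2 - (y_1^2 + z_1^2)\right) \\
\dot{z}_2 &= 0.2 + z_2(x_2 - 9.0) - k\left(z_2 - (x_1^2 + z_1^2)\right)
\end{aligned}
\qquad [19]
\]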
As shown in Figure 7, it appears impossible to tell 
what the relation is between the transmitter 
subsystem $(x_1, y_1, z_1)$ in eqn [19] and the two Rössler 
response subsystems $(x_2, y_2, z_2)$ at $k = 1$ and $k = 100$. 
However, GS occurs for large values of the 
coupling-strength parameter $k$. Therefore, for such 
values we expect that orbits of [19] will lie in the 
vicinity of a certain synchronization manifold. 
Indeed, let us define the set 

\[
S = \left\{ (x_1, y_1, z_1, x_2, y_2, z_2) \in \mathbb{R}^6 : \; x_2 = x_1^2 + y_1^2, \;
y_2 = y_1^2 + z_1^2, \; z_2 = x_1^2 + z_1^2 \right\}
\]


Since the projections of $S$ onto the coordinates 
$(x_1, y_1, x_2)$, $(y_1, z_1, y_2)$, and $(x_1, z_1, z_2)$ are 
paraboloids, we can see how the synchronization manifold 
is approached. This is illustrated in Figure 8, where 
the $(x_1, y_1, x_2)$ projections of typical trajectories are 
shown at four different coupling values. (See Josić 
(2000) for other examples and further developments; 
see also Pecora et al. (1997), where the 
authors summarize a method for getting an idea 
of the functional relation occurring, in the case of GS, 
between two coupled systems.) 


Phase Synchronization 


For coupled non-identical chaotic systems, other 
types of synchronization exist. Recently, a rather 
weak degree of synchronization of chaotic 
systems, PS, has been described (Pikovsky et al. 2001). 
The Greek meaning of the word synchronization, 
mentioned in the introduction, is closely related to 
this type of process. The synchronous motion is 
actually not visible. Indeed, in PS the phases of 
the chaotic systems are locked, that is, there 
exists a certain relation between them, whereas the 
amplitudes vary chaotically and are practically 
uncorrelated. Thus, PS is closest to the 
synchronization of periodic oscillators. 


Definition  PS of two coupled chaotic oscillators 
occurs if, for arbitrary integers $n$ and $m$, the phase 
locking condition between the corresponding 
phases, $|n\phi_1(t) - m\phi_2(t)| <$ constant, holds and the 
amplitudes of both systems remain uncorrelated. 


Let us note that such a phenomenon occurs when 
a zero Lyapunov exponent of the response system 
becomes negative, while, as explained above, iden- 
tical chaotic systems synchronize by following the 
same chaotic trajectory, when their largest trans- 
verse Lyapunov exponent of the synchronized 
manifold decreases from positive to negative values. 

Figure 7 Projections onto the (x-y) plane of typical trajectories of system [19]. (a) $(x_1, y_1)$ projection, that is, a typical trajectory of 
system [9]; (b) and (c) $(x_2, y_2)$ projections at, respectively, $k = 1$ and $k = 100$. 

Figure 8 Generalized synchronization. $(x_1, y_1, x_2)$ projections of typical trajectories of system [19] after transients die out, with 
(a) $k = 1$, (b) $k = 20$, (c) $k = 100$, and (d) $k = 200$. For the last value, the attractor lies in the set $S$, three-dimensional projections of 
which are paraboloids. 

Moreover, following the definition above, this 
phenomenon is best observed when a well-defined 
phase variable can be identified in both coupled 
systems. This can be done for strange attractors that 
spiral around a “hole,” or a particular (fixed) point 
in a two-dimensional projection of the attractor. The 
typical example is given by the Rössler system, which, 
for some range of parameters, exhibits a 
Möbius-strip-like chaotic attractor with a central hole. In such 
a case, a phase angle $\phi(t)$ can be defined that decreases 
or increases monotonically. For an illustration, we 
take the following two coupled Rössler oscillators: 

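\[
\begin{aligned}
\dot{x}_1 &= -\omega_1 y_1 - z_1 + k(x_2 - x_1) \\
\dot{y}_1 &= \omega_1 x_1 + 0.17\,y_1 \\
\dot{z}_1 &= 0.2 + z_1(x_1 - 9.0) \\
\dot{x}_2 &= -\omega_2 y_2 - z_2 + k(x_1 - x_2) \\
\dot{y}_2 &= \omega_2 x_2 + 0.17\,y_2 \\
\dot{z}_2 &= 0.2 + z_2(x_2 - 9.0)
\end{aligned}
\qquad [20]
\]

with a small parameter mismatch $\omega_{1,2} = 0.95 \pm 0.04$; 
$k$ governs the strength of coupling. 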
If we can define a Poincaré section surface for 
the system, then, for each piece of a trajectory 
between two cross sections with this surface, we 
define the phase, as done in Pikovsky et al. (2001), 
as a piecewise linear function of time, so that the 
phase increment is $2\pi$ at each rotation: 

\[
\phi(t) = 2\pi \frac{t - t_n}{t_{n+1} - t_n} + 2\pi n, \qquad t_n \le t \le t_{n+1}
\]

where $t_n$ is the time of the $n$th crossing of the secant 
surface. 
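
The piecewise linear phase can be evaluated directly from a list of crossing times. The helper below is a small sketch under stated assumptions: the name `phase_from_crossings` is hypothetical, crossings are numbered from 0 (so the first recorded crossing has phase 0), and the crossing times are assumed to be already extracted from a trajectory and sorted in increasing order.

```python
import bisect
import math


def phase_from_crossings(t, crossings):
    """Piecewise linear phase that gains 2*pi between consecutive crossing times.

    `crossings` is a sorted list [t_0, t_1, ...] of times at which the
    trajectory crosses the chosen Poincare section (indexing from 0 is an
    assumption of this sketch).
    """
    if len(crossings) < 2 or not (crossings[0] <= t <= crossings[-1]):
        raise ValueError("t must lie between the first and last crossing")
    # Index n of the interval [t_n, t_{n+1}] containing t.
    n = min(bisect.bisect_right(crossings, t) - 1, len(crossings) - 2)
    t_n, t_next = crossings[n], crossings[n + 1]
    return 2.0 * math.pi * (t - t_n) / (t_next - t_n) + 2.0 * math.pi * n
```

Applying the same function to the crossing times of both oscillators in [20] and subtracting the results gives the phase difference whose boundedness is the criterion for PS.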

In our example, the secant surface has been chosen as the 
negative x-axis and is represented by the wide segment 
in Figure 9a. This definition of the phase is clearly 
ambiguous since it depends on the choice of the 
Poincaré section; nevertheless, defined in this way, 
the phase has a physically important property: it 
corresponds to the direction with the zero 
Lyapunov exponent in the phase space, so its 
perturbations neither grow nor decay in time. Figure 9c 
shows that there is a transition from the 
nonsynchronous phase regime, where the phase difference 
increases almost linearly with time ($k = 0.01$ and 
$k = 0.05$), to a synchronous state, where the relation 
$|\phi_1(t) - \phi_2(t)| <$ constant holds ($k = 0.1$), that is, 
the phase difference does not grow with time. 
However, the amplitudes are obviously uncorrelated, 
as seen in Figure 9b. This example shows that 
PS can take place as a weaker degree of 
synchronization in chaotic systems. Readers can find a more 
rigorous mathematical discussion of this subject, 
and of the definition of phases of chaotic oscillators, 
in Pikovsky et al. (2001); see also Boccaletti et al. 
(2002) and references therein. 


Other Treatments and Types 
of Synchronization 


Lag Synchronization 


Figure 9 (a) Rössler chaotic attractor projection onto the x-y plane. (b) Amplitudes $A_1$ versus $A_2$ for the phase synchronized case at 
$k = 0.1$. (c) Time series of the phase difference for different coupling strengths $k$; for $k = 0.01$ PS is not achieved, while for $k = 0.1$ PS takes 
place. Although the phases are locked for $k = 0.1$, the amplitudes remain chaotic and uncorrelated. 

PS occurs when non-identical chaotic 
oscillators are weakly coupled: the phases are 
locked, while the amplitudes remain uncorrelated. 
When the coupling strength becomes larger, some 
relationships between amplitudes may be 
established. Indeed, it has been shown (Rosenblum et al. 
1997), in symmetrically coupled non-identical 
oscillators and in time-delayed systems, that there exists 
a regime of LS. This process appears as a 
coincidence of time-shifted states of two systems: 

\[
\lim_{t \to +\infty} \| Y(t) - X(t - \tau) \| = 0
\]

where $\tau$ is a positive delay. 


Projective Synchronization 


In coupled partially linear systems, it was reported 
by Mainieri and Rehacek (1999) that two identical 
systems could be synchronized up to a scaling factor. 
This type of chaotic synchronization is referred to as 
projective synchronization. Consider, for example, a 
three-dimensional chaotic system $\dot{X} = F(X)$, where 
$X = (x, y, z)$. Decompose $X$ into a vector $v = (x, y)$ 
and a scalar $z$; the system can then be rewritten as 


\[
\frac{dv}{dt} = g(v, z), \qquad \frac{dz}{dt} = h(v, z)
\]


In projective synchronization, two identical 
systems $X_1 = (x_1, y_1, z_1)$ (drive) and $X_2 = (x_2, y_2, z_2)$ 
(response) are coupled through the scalar variable $z$. 
It occurs if the state vectors $v_1$ and $v_2$ synchronize up 
to a constant ratio, that is, $\lim_{t \to +\infty} \|\alpha v_1(t) - v_2(t)\| = 0$, 
where $\alpha$ is called a scaling factor. For 
partially linear systems, it may automatically occur 
provided that the systems satisfy some stability 
conditions. 

However, this process could not be classified as 
GS, even if there exists a linear relation between the 
coupled systems, because the response system of 
projective synchronization is not asymptotically 
stable. For more information about this subject, 
the reader is referred to Mainieri and Rehacek 
(1999). 


Anticipating Synchronization 


It is interesting to mention that a new form of 
synchronization has recently appeared, the so-called 
anticipating synchronization (Boccaletti et al. 2002). 
Here, some coupled chaotic systems can 
synchronize in such a way that the response anticipates the 
drive by synchronizing with its future states. 

It is also interesting to mention the nonlinear $H_\infty$ 
synchronization method for nonautonomous 
schemes introduced by Suykens et al. (1997). 


Spatio-Temporal Synchronization 


Low-dimensional systems have rather limited useful- 
ness in modeling real-world applications. This is 
why the synchronization of chaos has been carried 


out in high dimensions (see Kocarev et al. (1997) for 
a review). See also Chen and Dong (2001) for a 
discussion of special high-dimensional systems, 
namely large arrays of coupled chaotic systems. 


Application to Transmission Systems 
and Secure Communication 


Synchronization principles are useful in practical 
applications. Use of chaotic signals to transmit 
information has been a very active research topic 
in the last decade. Thus, it has been established that 
chaotic circuits may be used to transmit information 
by synchronization. As a result, several proposals 
for secure-communication schemes have been 
advanced (see, e.g., Cuomo et al. (1993), Hasler 
(1998), and Parlitz et al. (1999)). The first labora- 
tory demonstration of a secure-communication 
system, which uses a chaotic signal for masking 
purposes, and which exploits the chaotic synchroni- 
zation techniques to recover the signal, was reported 
by Kocarev et al. (1992). 

It is difficult, within the scope of this article, to 
give a complete or detailed discussion, and it should 
be noted that there exist many competing and tested 
methods that are well established. 

The main idea of the communication schemes is 
to encode a message by means of a chaotic 
dynamical system (the transmitter), and to decode 
it using a second dynamical system (the receiver) 
that synchronizes with the first. In general, secure- 
communication applications assume additionally 
that the coupled systems used are identical. 

Different methods can be used to hide the useful 
information, for example, chaotic masking, chaotic 
switching, or direct chaotic modulation (Hasler 
1998). For instance, in the chaotic masking method, 
an analog information-carrying signal $s(t)$ is 
added to the output $y(t)$ of the chaotic system in the 
transmitter. The receiver tries to synchronize with 
the component $y(t)$ of the transmitted signal $s(t) + y(t)$. 
If synchronization takes place, the information 
signal can be retrieved by subtraction (Figure 10). 

It is interesting to note that, in all proposed 
schemes for secure communications using the idea of 
synchronization (experimental realization or com- 
puter simulation), there is an inevitable noise 
degrading the fidelity of the original message. 


[Block diagram: information signal $s(t)$ → transmitter → transmitted (chaotic) signal → receiver → retrieved information signal.] 

Figure 10 A typical communication setup. 
Robustness to parameter mismatch was addressed 
by many authors (Illing et al. 2002). Lozi et al. 
(1993) showed that, by connecting two identical 
receivers in cascade, a significant amount of the 
noise can be reduced, thereby allowing the recovery 
of a much higher quality signal. 

Furthermore, different implementations of chaotic 
secure communication have been proposed during 
the last decades, as well as methods for cracking this 
encoding. The methods used to crack such a chaotic 
encoding make use of the low dimensionality of the 
chaotic attractors. Indeed, since the properties of 
low-dimensional chaotic systems with one positive 
Lyapunov exponent can be reconstructed by analyz- 
ing the signal, such as through the delay-time 
reconstruction methods, it seems unlikely that these 
systems might provide a secure encryption method. 
The hidden message can often be retrieved easily by 
an eavesdropper without using the receiver. But, 
chaotic masking and encoding are difficult to break, 
using the state-of-the-art analysis tools, if suffi- 
ciently high dimensional chaos generators with 
multiple positive Lyapunov exponents (i.e., hyperch- 
aotic systems) are used (see Pecora et al. (1997), and 
references therein). 


Conclusion 


In spite of the essential progress in theoretical and 
experimental studies, synchronization of chaotic 
systems continues to be a topic of active investigation 
and will certainly continue to have a broad 
impact in the future. The theory of synchronization 
remains a challenging problem of nonlinear 
science. 


See also: Bifurcations of Periodic Orbits; Chaos and 
Attractors; Fractal Dimensions in Dynamics; Generic 
Properties of Dynamical Systems; Isochronous Systems; 
Lyapunov Exponents and Strange Attractors; Singularity 
and Bifurcation Theory; Stability Theory and KAM; 
Weakly Coupled Oscillators. 
Introduction 


Quantum field theory was initially invented in order 
to describe high-energy elementary particles, thereby 
unifying quantum mechanics and special relativity. 
In other words, quantum field theory was addressed 
to the so-called vacuum sector, that is, roughly 
speaking physics at zero temperature and zero 
particle density. 

The same applies to the various mathematically 
rigorous versions of quantum field theory that have 
been developed since the mid-1950s. Indeed, in 
Wightman’s axiomatic setting, quantum field theory 
is described in terms of a set of so-called vacuum 
expectation values. The “algebraic approach” to 
quantum field theory developed by Araki, Haag, 
Kastler, and their collaborators is more flexible in 
nature. In fact, right from the beginning, the new 
algebraic tools were successfully applied to lattice 
models and other nonrelativistic systems with 
infinitely many degrees of freedom (see Operator 
algebras and quantum statistical mechanics by 
O Bratteli and D W Robinson). But the need to 
treat large systems of relativistic particles was 
apparently not felt. Even in Haag’s recent mono- 
graph, Local Quantum Physics, the subjects of 
algebraic quantum field theory and algebraic quan- 
tum statistical mechanics are treated separately. 

It is remarkable that constructive field theory 
was ahead of its time in this respect. The famous 
$P(\phi)_2$ model (first constructed by Glimm and Jaffe) 
was adapted to thermal states by Høegh-Krohn 
as early as 1974 (see Høegh-Krohn (1974)). 
His paper was properly named “Relativistic 
quantum statistical mechanics in two-dimensional 
space-time,” but only recently has it received 
proper attention. 

At the same time, around 1974, cosmology and 
heavy-ion collisions drew the interest of physicists 
towards the quantum statistical mechanics of hot 
relativistic quantum systems. Well-known papers 
from this early stage include those by Weinberg, 
Bernard, and Dolan and Jackiw. While most of the 
papers used Euclidean path integrals, Umezawa and his 
school developed a real-time framework called 
“thermo-field dynamics,” which involved a doubling 
of the degrees of freedom. The excellent review by 
Landsman and van Weert (1987) covers these early 
attempts; it also explains the basic connection to the 
algebraic approach. 

In the following years, it became evident that 
statistical mechanics (in its standard formulation) is 
barely sufficient to derive the properties of bulk 
matter from the underlying microscopic description 
provided by quantum field theory. Thus, various 
people began to establish mathematically rigorous 
foundations for the description of thermal field 
theory. The most successful approach was launched 
by D Buchholz (with various collaborators), who, 
from about 1985 onwards, started applying the 
KMS condition (which describes a thermal equili- 
brium state in the operator-algebraic framework of 
local quantum physics) to relativistic quantum field 
theory. In 1994, Buchholz and Bros managed to 
integrate the holomorphic structure of Wightman 
field theory into Haag’s operator-algebraic frame- 
work, which led them to the notion of a relativistic 
KMS condition. 

The advanced mathematical concepts involved in 
the formulation of entropy densities for thermal 
quantum fields (see Narnhofer (1994)) do not allow 
us to present this topic. The reader is referred to the 
excellent book Quantum Entropy and Its Use by 
M Ohya and D Petz for an introduction to the 
subject. A discussion of the so-called thermalization 
effects that occur as a result of a curved spacetime is 
provided in Quantum Field Theory in Curved 
Spacetime. Another subject, which is missing almost 
completely, is perturbation theory. This subject has 
been covered extensively in three well-known text- 
books by Kapusta, Le Bellac, and Umezawa. 


Observables and States 


Following Heisenberg, we start from the basic 
assumption that quantum theory can be formulated 
in terms of observables which form an algebra A, that 
is, a vector space with a (noncommutative) multi- 
plication law. Although our emphasis on the abstract 
algebraic structure may look strange, there is a 
profound reason for starting out with an abstract 
algebra of observables: as soon as one considers 
systems with infinitely many degrees of freedom, one 
encounters a possibility to realize the abstract elements 
of the algebra A as operators on a Hilbert space in 
various inequivalent ways. The famous equivalence 
between the Heisenberg and the Schrödinger picture 
simply breaks down. States which are macroscopically 
different (e.g., thermal equilibrium states for different 
temperatures) give rise — in a natural way, which will 
be discussed in the sequel - to unitarily inequivalent 
representations of the abstract algebra of observables 
A, while states which only differ microscopically can 
be accommodated by density matrices within the same 
Hilbert space. In other words, a physical state is 
described macroscopically by specifying a representa- 
tion, and microscopically by a density matrix in this 
representation. 

In a Lagrangian approach, the algebra of obser- 
vables A may be thought of as being generated by 
the underlying fields, currents, etc. This leads to the 
so-called polynomial algebras. It is mathematically 
convenient to assume that A is an algebra of 
bounded operators, generated by the bounded 
functions of the underlying quantum fields. If $\phi(x)$ 
is any such field and if $f \in \mathcal{S}(\mathbb{R}^{d+1})$ is any real test 
function with support in a bounded region of 
spacetime, then the corresponding operator 

\[
W(f) = \exp\left( i \int dx \, f(x)\phi(x) \right)
\]

would be a typical element of $\mathcal{A}$. The set of 
operators $\{ W(f) \mid \operatorname{supp} f \subset O \}$ will generate a 
subalgebra $\mathcal{A}(O)$ of $\mathcal{A}$. The underlying fields can be 
recovered by taking (functional) derivatives, once a 
representation of A on a Hilbert space is specified. 
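The spacetime symmetry of Minkowski space 
manifests itself in the existence of a representation 

\[
\alpha : (\Lambda, x) \mapsto \alpha_{\Lambda, x} \in \operatorname{Aut}(\mathcal{A}), \qquad (\Lambda, x) \in P_+^{\uparrow}
\]

of the (orthochronous) Poincaré group $P_+^{\uparrow}$. Here 
$\alpha_{\Lambda, x}$ is an automorphism of $\mathcal{A}$, that is, a mapping 
from $\mathcal{A}$ to $\mathcal{A}$ which preserves the algebraic structure. 
Once a Lorentz frame is fixed by choosing a timelike 
vector $e \in V_+$, the time evolution $t \mapsto \alpha_{1, te}$ will be 
denoted by $t \mapsto \tau_t$. 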

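For the free field, the group of automorphisms 
$(\Lambda, x) \mapsto \alpha_{\Lambda,x}$ is defined by 

$$\alpha_{\Lambda,x}(W(f)) = W\big( f(\Lambda^{-1}(\,\cdot\, - x)) \big)$$

As before, $f \in \mathcal{S}(\mathbb{R}^{d+1})$ is a Schwartz function over 
the Minkowski space $\mathbb{R}^{d+1}$. 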

While the invariance of the equations of motion is 
reflected in the existence of a representation of the 
Poincaré group in terms of automorphisms in the 
Heisenberg picture, at least the invariance with 
respect to Lorentz boosts is spontaneously broken 
in the Schródinger picture for a thermal equilibrium 
state. 

The usual notions of vector states and density 
matrices associated with a given Hilbert space 
(usually Fock space) are a priori not general enough 
to cover all cases of interest in thermal field theory. 
The following algebraic definition substantially 
generalizes the notion of a state: A state $\omega$ 
is a positive, linear, and normalized functional, that 
is, a linear map $\omega : \mathcal{A} \to \mathbb{C}$ such that 

$$\omega(a^* a) \ge 0 \quad \text{and} \quad \omega(\mathbb{1}) = 1$$

Once a state $\omega$ is distinguished on physical grounds, 
the GNS reconstruction theorem provides a Hilbert 
space $\mathcal{H}_\omega$ and a representation $\pi_\omega$ of $\mathcal{A}$, that is, a 
map from $\mathcal{A}$ to the set of bounded operators $\mathcal{B}(\mathcal{H}_\omega)$, 
which preserves the algebraic relations. 

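It is instructive to consider the GNS representation 
of the Pauli matrices $\{\sigma_0 = \mathbb{1}, \sigma_1, \sigma_2, \sigma_3\}$. Given a 
state (a diagonal $2 \times 2$ matrix $\rho$ with nonnegative entries 
and $\operatorname{tr} \rho = 1$), the left regular representation (a 
construction well known from group theory) 

$$\pi(\sigma_i)\,|\sqrt{\rho}\,\rangle = |\sigma_i \sqrt{\rho}\,\rangle, \qquad i = 0, 1, 2, 3$$

defines a reducible representation on $\mathbb{C}^4$, unless one 
of the entries in the diagonal of $\rho$ is zero (which 
corresponds to a pure state). In the latter case, the 
GNS Hilbert space is $\mathbb{C}^2$. By construction, 

$$\langle \sqrt{\rho}\,|\,\pi(\sigma_i)\,|\,\sqrt{\rho}\,\rangle = \operatorname{tr} \rho\,\sigma_i, \qquad i = 1, 2, 3.$$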
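The Pauli-matrix example can be checked by hand on a computer. The following is a minimal sketch, assuming that the GNS vectors are realized as $2 \times 2$ matrices $\sigma_i \sqrt{\rho}$ equipped with the Hilbert-Schmidt inner product $\langle A, B \rangle = \operatorname{tr}(A^* B)$; the function names are illustrative only. It computes the dimension of the span of the vectors $\sigma_i \sqrt{\rho}$ and compares the vector-state expectation values with $\operatorname{tr} \rho\,\sigma_i$.

```python
import math

# 2x2 matrices are stored as nested lists [[a, b], [c, d]].
PAULI = [
    [[1, 0], [0, 1]],      # sigma_0 = identity
    [[0, 1], [1, 0]],      # sigma_1
    [[0, -1j], [1j, 0]],   # sigma_2
    [[1, 0], [0, -1]],     # sigma_3
]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def sqrt_state(p):
    """Square root of the diagonal density matrix rho = diag(p, 1 - p)."""
    return [[math.sqrt(p), 0.0], [0.0, math.sqrt(1.0 - p)]]


def gns_vectors(p):
    """The vectors sigma_i sqrt(rho), i = 0..3, flattened to elements of C^4."""
    root = sqrt_state(p)
    return [[x for row in matmul(s, root) for x in row] for s in PAULI]


def rank(vectors, tol=1e-12):
    """Rank of a list of complex vectors, by Gaussian elimination."""
    rows = [[complex(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0])):
        pivot = max(range(r, len(rows)),
                    key=lambda i: abs(rows[i][col]), default=None)
        if pivot is None or abs(rows[pivot][col]) < tol:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def expectation(p, i):
    """<sqrt(rho), sigma_i sqrt(rho)> in the Hilbert-Schmidt inner product."""
    root = sqrt_state(p)
    image = matmul(PAULI[i], root)
    return sum(root[j][k].conjugate() * image[j][k]
               for j in range(2) for k in range(2))


def trace_rho_sigma(p, i):
    """tr(rho sigma_i) for rho = diag(p, 1 - p)."""
    m = matmul([[p, 0.0], [0.0, 1.0 - p]], PAULI[i])
    return m[0][0] + m[1][1]
```

For a mixed state such as `p = 0.7` the four vectors are linearly independent, so the representation space is four-dimensional; for the pure state `p = 1.0` the span collapses to two dimensions, in line with the statement above.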


Thermal Equilibrium 


The variety of nonequilibrium states ranges from 
mild perturbations of equilibrium states through 
steady states, whose properties are governed 
by external heat baths, or hydrodynamic flows 
up to totally chaotic states which no longer 


admit a description in terms of thermodynamic 
notions. Buchholz et al. (2002) have initiated an 
investigation of nonequilibrium states that are 
locally (but not globally) close to thermal equili- 
brium. Unfortunately, we will not be able to cover 
this topic. Instead, we will concentrate on states 
which deviate from a true equilibrium state only 
microscopically. 


Characterization of Thermal Equilibrium States 


When the time evolution $t \mapsto \tau_t \in \operatorname{Aut}(\mathcal{A})$ is changed 
by a local perturbation, which is slowly switched on 
and slowly switched off again, then an equilibrium 
state $\omega$ returns to its original form at the end of this 
procedure. This heuristic condition of adiabatic 
invariance can be expressed by the stability 
requirement 

$$\lim_{T \to \infty} \int_{-T}^{T} dt\; \omega\big([a, \tau_t(b)]\big) = 0 \qquad \forall a, b \in \mathcal{A} \qquad [1]$$

In a pioneering work Haag, Kastler, and Trych-Pohlmeyer 
showed that the characterization [1] of 
an equilibrium state leads to a sharp mathematical 
criterion, first encountered by Haag, Hugenholtz, 
and Winnink and more implicitly by Kubo, Martin, 
and Schwinger: 


Definition 1 A state $\omega_\beta$ over $\mathcal{A}$ is called a KMS 
state for some $\beta > 0$, if for all $a, b \in \mathcal{A}$, there exists a 
function $F_{a,b}$ which is continuous in the strip 
$0 \le \Im z \le \beta$ and analytic and bounded in the open strip 
$0 < \Im z < \beta$, with boundary values given by 

$$F_{a,b}(t) = \omega_\beta(a\,\tau_t(b)) \quad \text{and} \quad F_{a,b}(t + i\beta) = \omega_\beta(\tau_t(b)\,a) \qquad \forall t \in \mathbb{R} \qquad [2]$$

Before we start analyzing the properties of KMS 
states, we should mention an alternative characterization 
of thermal equilibrium states: passivity. The 
amount of work a cycle can perform when applied 
to a moving thermodynamic equilibrium state is 
bounded by the amount of work an ideal windmill 
or turbine could perform; this property is called 
semipassivity (Kuckert 2002): a state $\omega$ is called 
semipassive (passive) if there is an "efficiency 
bound" $E \ge 0$ ($E = 0$) such that 

$$-(W\Omega_\omega, H_\omega W\Omega_\omega) \le E \cdot (W\Omega_\omega, |P_\omega|\, W\Omega_\omega) \qquad \forall W \in \pi_\omega(\mathcal{A})''$$

with $W^{*} = W^{-1}$, $[H_\omega, W] \in \pi_\omega(\mathcal{A})''$, and $[P_\omega, W] \in \pi_\omega(\mathcal{A})''$. 
Here $(H_\omega, P_\omega)$ denote the generators implementing 
the spacetime translations in the GNS 
representation $(\mathcal{H}_\omega, \Omega_\omega, \pi_\omega)$. Generalizing the notion 
of complete passivity, the state $\omega$ is called completely 
semipassive if all its finite tensorial powers are 
semipassive with respect to one fixed efficiency 
bound $E$. It has been shown by Kuckert (2002) 
that a state is completely semipassive in all inertial 
frames if and only if it is completely passive in some 
inertial frame. The latter implies that $\omega$ is a KMS 
state or a ground state (a result due to Pusz and 
Woronowicz). 
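For a system with finitely many levels the KMS condition of Definition 1 can be verified directly. The sketch below is an illustration under simplifying assumptions that are not part of the text: the algebra is taken to be the $n \times n$ matrices, the Hamiltonian is diagonal with eigenvalues `energies`, the time evolution is $\tau_z(b) = e^{izH} b\, e^{-izH}$ (extended to complex $z$), and the state is the Gibbs state $\omega_\beta(a) = \operatorname{tr}(e^{-\beta H} a)/\operatorname{tr}(e^{-\beta H})$. All function names are hypothetical.

```python
import cmath
import math


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def gibbs(energies, beta, a):
    """Gibbs expectation tr(exp(-beta H) a) / tr(exp(-beta H)), H diagonal."""
    weights = [math.exp(-beta * e) for e in energies]
    return sum(w * a[j][j] for j, w in enumerate(weights)) / sum(weights)


def evolve(energies, z, b):
    """tau_z(b) = exp(i z H) b exp(-i z H) for diagonal H and complex z."""
    n = len(energies)
    return [[cmath.exp(1j * z * (energies[j] - energies[k])) * b[j][k]
             for k in range(n)] for j in range(n)]


def kms_defect(energies, beta, a, b, t):
    """|F(t + i beta) - omega(tau_t(b) a)| with F(z) = omega(a tau_z(b))."""
    upper = gibbs(energies, beta,
                  matmul(a, evolve(energies, t + 1j * beta, b)))
    lower = gibbs(energies, beta, matmul(evolve(energies, t, b), a))
    return abs(upper - lower)
```

In this finite-dimensional setting the function $F_{a,b}$ is entire, so only the boundary relation needs checking; `kms_defect` vanishes up to rounding for arbitrary matrices `a`, `b` and real `t`.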

Let us now turn to properties of thermal 
equilibrium states which are specific for relativistic 
models. It was first recognized by Bros and 
Buchholz (1994) that KMS states of a relativistic 
theory have stronger analyticity properties in con- 
figuration space than those imposed by the tradi- 
tional KMS condition: 


Definition 2 A KMS state $\omega_\beta$ satisfies the relativistic 
KMS condition (Bros and Buchholz 1994) if there 
exists a unit vector $e$ in the forward light cone $V_+$ 
such that for every pair of local elements $a, b$ of $\mathcal{A}$ 
the function $F_{a,b}$, 

$$F_{a,b}(x_1, x_2) = \omega_\beta\big(\alpha_{x_1}(a)\,\alpha_{x_2}(b)\big)$$

extends to an analytic function in the tube domain 
$-\mathcal{T}_{\beta e/2} \times \mathcal{T}_{\beta e/2}$, where 
$\mathcal{T}_{\beta e/2} = \{ z \in \mathbb{C}^{d+1} \mid \Im z \in V_+ \cap (\beta e/2 - V_+) \}$. 


The relativistic KMS condition can be understood 
as a remnant of the relativistic spectrum condition in 
the vacuum sector. It has been rigorously established 
(Bros and Buchholz 1994) for the KMS states 
constructed by Buchholz and Junglas (1989) and by 
C Gérard and the author for the $P(\phi)_2$ model. In the 
thermal Wightman framework (Bros and Buchholz 
1996) it has been shown that the relativistic KMS 
condition implies the existence of model-independent 
analyticity properties of thermal $n$-point functions. 
These properties also appear in perturbative computations 
of the thermal Wightman functions 
(Steinmann 1995). 

We now turn to the properties of the set of KMS 
states. For given $\beta$, the convex set $S_\beta$ of all KMS 
states is known to form a simplex; the extreme 
points in the set $S_\beta$ are called extremal KMS states. 
As a consequence, the extremal states in $S_\beta$ can be 
distinguished with the help of "classical" (central) 
observables, that is, by observables which commute 
with all other observables. 

If $\omega$ is an extremal KMS state and $\gamma$ is an 
automorphism which commutes with the time 
evolution $t \mapsto \tau_t$, then the state $\omega'$ defined by 

$$\omega'(a) := \omega(\gamma(a)), \qquad a \in \mathcal{A}$$

is again an extremal KMS state to the same 
parameter values. If $\omega' \ne \omega$, one says that the 
symmetry is spontaneously broken. 

Lorentz invariance with respect to boosts is 
always broken by a KMS state, since the KMS 
condition distinguishes a rest frame. A KMS state 
might also break spatial translation or rotation 
invariance. However, by averaging over the different 
configurations one can usually construct a translation- 
and rotation-invariant state. The situation is 
drastically different with respect to supersymmetry. 
Buchholz and Ojima (1997) have shown that supersymmetry 
is broken in any thermal state and it is 
impossible to proceed from it by "symmetrization" 
to states on which an action of supercharges can be 
defined. 


Existence of Thermal Equilibrium States 


Buchholz and Junglas (1989) demonstrated that the 
existence of KMS states can be guaranteed for a 
large class of quantum field-theoretic models. The 
basic assumption to be met concerns the phase-space 
properties of the model. A generalized trace norm 
(the so-called “nuclear norm") is used to estimate 
the “number” of degrees of freedom in phase space. 

The first step is to construct a subspace $\mathcal{H}(\Lambda)$ 
of the vacuum Hilbert space $\mathcal{H}_{\rm vac}$, which represents 
excitations of the vacuum strictly localized inside of 
a bounded spacetime region $\hat{\mathcal{O}}$. Due to the strong 
correlations present in the vacuum state of any 
relativistic model, as a consequence of the Reeh-Schlieder 
property (see the section "Analyticity of n-point 
functions"), this is a delicate procedure, which 
involves the so-called "split property." This property 
ensures that there exists a product vector $\eta$ in the 
vacuum Hilbert space $\mathcal{H}_{\rm vac}$ such that 

$$(\eta, \pi_{\rm vac}(ab)\,\eta) = \omega_{\rm vac}(a) \cdot \omega_{\rm vac}(b) \qquad \forall a \in \mathcal{A}(\mathcal{O}),\ b \in \mathcal{A}(\hat{\mathcal{O}})' \qquad [3]$$

Here $\mathcal{O} \subset \hat{\mathcal{O}}$ denotes a slightly smaller open spacetime 
region (such that the closure $\overline{\mathcal{O}}$ is inside the 
interior of $\hat{\mathcal{O}}$) and 
$\mathcal{A}(\hat{\mathcal{O}})' := \{ A \in \mathcal{A} \mid [A, B] = 0\ \ \forall B \in \mathcal{A}(\hat{\mathcal{O}}) \}$. 
The existence of a product vector can be 
ensured if the nuclear norm satisfies some mild 
bounds which are expected to hold in all models of 
physical interest. Given a product vector $\eta$ which 
satisfies [3], the sought-after subspace is 

$$\mathcal{H}(\Lambda) = \overline{\pi_{\rm vac}(\mathcal{A}(\mathcal{O}))\,\eta}$$


The crucial step in the proof of existence of KMS 
states is to show that 

$$\operatorname{tr} E(\Lambda)\, e^{-\beta H} E(\Lambda) < \infty \quad \text{for } \beta > 0$$

if the nuclearity condition holds. Here $E(\Lambda)$ denotes 
the projection onto the subspace $\mathcal{H}(\Lambda)$ representing 
localized excitations and $H$ denotes the Hamiltonian 
in the vacuum representation $\pi_{\rm vac}$. Next it is shown 
that the function 

$$t \mapsto \omega_\Lambda(a\,\tau_t(b)) := \frac{1}{Z} \operatorname{tr} E(\Lambda)\, e^{-\beta H} E(\Lambda)\, \pi_{\rm vac}(a\,\tau_t(b))$$

allows an analytic extension to a strip of width $\beta$ 
which satisfies the KMS boundary condition [2] for 
$|t| < \delta$ if $a, b \in \mathcal{A}(\mathcal{O}_\circ)$ and $\mathcal{O}_\circ + te \subset \mathcal{O}$ for $|t| < \delta$. In 
the final step, Buchholz and Junglas were able to 
demonstrate that bounds on the nuclear norm are 
even sufficient to control the thermodynamic limit. 

Given a thermal field theory, a slight variation of 
the method used by Buchholz and Junglas allows 
one to construct a KMS state for a new temperature 
(Jäkel 2004), that is, to change the temperature of a 
thermal state. 


Thermal Representations 


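Given a KMS state $\omega_\beta$, the GNS construction gives 
rise to a Hilbert space $\mathcal{H}_\beta$ and a representation $\pi_\beta$, 
called a thermal representation, of $\mathcal{A}$. The algebra 
$\mathcal{R}_\beta := \pi_\beta(\mathcal{A})''$ possesses a cyclic (due to the GNS 
construction) and separating (due to the KMS 
condition) vector $\Omega_\beta$ such that 

$$\omega_\beta(a) = (\Omega_\beta, \pi_\beta(a)\,\Omega_\beta) \qquad \forall a \in \mathcal{A}$$

The KMS condition implies that $\omega_\beta$ is invariant 
under time translations, that is, $\omega_\beta \circ \tau_t = \omega_\beta$ for all 
$t \in \mathbb{R}$. Thus, 

$$U(t)\,\pi_\beta(a)\,\Omega_\beta = \pi_\beta(\tau_t(a))\,\Omega_\beta, \qquad a \in \mathcal{A}$$

defines a strongly continuous unitary group 
$\{U(t)\}_{t \in \mathbb{R}}$ implementing the time evolution in the 
representation $\pi_\beta$. By Stone's theorem there exists a 
self-adjoint generator $L$ such that 

$$U(t) = e^{iLt}, \qquad t \in \mathbb{R} \qquad [4]$$

For $0 < \beta < \infty$, the Liouville operator $L$ is not 
bounded from below; its spectrum is symmetric and 
consists typically of the whole real line. However, 
the negative part of $L$ is "suppressed" with respect 
to the algebra of observables $\mathcal{R}_\beta := \pi_\beta(\mathcal{A})''$ in the 
following sense (Haag 1992): let $P_{(-\infty, -\kappa]}$ be the 
spectral projection of $L$ for the interval $(-\infty, -\kappa] \subset \operatorname{Sp}(L)$, then 

$$\| P_{(-\infty, -\kappa]}\, A\, \Omega_\beta \| \le e^{-\beta\kappa/2}\, \|A\| \qquad \forall A \in \mathcal{R}_\beta$$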


We now turn to structural aspects which are 
characteristic for a relativistic model, namely the 
existence of strong spatial correlations and the 
connection between the decay of these correlations 
and the spectral properties of the Liouville operator. 


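Let $\omega_\beta$ be a state which satisfies the relativistic 
KMS condition. It follows (using a theorem of 
Glaser) that for $a \in \mathcal{A}$ the function $\Phi_a : \mathbb{R}^{d+1} \to \mathcal{H}_\beta$, 

$$x \mapsto \pi_\beta(\alpha_x(a))\,\Omega_\beta$$

can be analytically continued from the real axis into 
the domain $\mathcal{T}_{\beta e/2}$ such that it is weakly continuous 
for $\Im z \searrow 0$. If the usual additivity assumption 
$\bigcup_i \mathcal{O}_i = \mathcal{O} \Rightarrow \bigvee_i \mathcal{R}_\beta(\mathcal{O}_i) = \mathcal{R}_\beta(\mathcal{O})$ for the local von 
Neumann algebras holds, then 

$$\mathcal{H}_\beta = \overline{\pi_\beta(\mathcal{A}(\mathcal{O}))\,\Omega_\beta} \qquad [5]$$

for any open spacetime region $\mathcal{O} \subset \mathbb{R}^{d+1}$. Junglas 
has shown that the thermal Reeh-Schlieder property 
[5] follows as well from the standard KMS condition, 
if $\omega_\beta$ is locally normal with respect to the 
vacuum representation. 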

The decay of spatial correlations depends on 
infrared properties of the model, and the essential 
ingredients for the following cluster theorem are the 
continuity properties of the spectrum of L near zero. 


Theorem 3 Let $\Omega_\beta$ denote the unique (up to a 
phase) normalized eigenvector with eigenvalue $\{0\}$ of 
the Liouvillean $L$ and let $P^+$ denote the projection 
onto the strictly positive part of the spectrum of $L$. 
Assume that there exist positive constants $m > 0$ 
and $C_1(\mathcal{O}) > 0$ such that 

$$\| e^{-\lambda L} P^+ \pi_\beta(a)\,\Omega_\beta \| \le C_1(\mathcal{O}) \cdot \lambda^{-m}\, \|a\| \qquad \forall a \in \mathcal{A}(\mathcal{O})$$

Here $\mathcal{O} \subset \mathbb{R}^{d+1}$ is an open and bounded spacetime 
region. Now consider two spacelike separated 
spacetime regions $\mathcal{O}_1, \mathcal{O}_2$, which can be embedded 
into $\mathcal{O}$ by translation and such that 
$\mathcal{O}_1 + \delta e \subset \mathcal{O}_2'$, $\delta \gg \beta$. Then, for $a \in \mathcal{A}(\mathcal{O}_1)$ and $b \in \mathcal{A}(\mathcal{O}_2)$, 

$$|\omega_\beta(ba) - \omega_\beta(b)\,\omega_\beta(a)| \le C_2 \cdot \delta^{-2m}\, \|a\|\, \|b\|$$

The constant $C_2(\beta, \mathcal{O}) \in \mathbb{R}^+$ may depend on the 
temperature $\beta$ and the size of the region $\mathcal{O}$ but is 
independent of $\delta$, $a$, and $b$. 

From explicit calculations one expects that 
$m = 1/2$ for free massless bosons in 3 + 1 spacetime 
dimensions. Consequently, the exponent given on 
the right-hand side is optimal since it is well known 
that in this case the correlations decay only like $\delta^{-1}$. 

A description of thermal representations would be 
inadequate without pointing out one of the deepest 
connections between pure mathematics and physics 
that emerged in the last century: consider a von 
Neumann algebra $\mathcal{R}$ which possesses a cyclic and 
separating vector $\Omega$. Then polar decomposition of 
the closeable operator $S : A\Omega \mapsto A^*\Omega$, $A \in \mathcal{R}$, provides 
an antiunitary operator $J$ (the modular 
conjugation) and a self-adjoint operator $\Delta^{1/2}$. The 
connection to physics was established independently 
by Takesaki and Winnink, showing that the pair 
$(\mathcal{R}, \sigma)$ satisfies the KMS condition for $\beta = -1$, if one 
sets $\sigma_t(A) = \Delta^{it} A\, \Delta^{-it}$ for $A \in \mathcal{R}$. 

Taking advantage of the Reeh-Schlieder property 
[5], one can associate modular objects to certain 
spacetime regions $\mathcal{O}$. In general, a physical interpretation 
of these modular objects is missing. But for 
two-dimensional thermal models, which factorize in 
light-cone coordinates, the modular group corresponding 
to the algebra of a spacelike wedge admits 
a simple description: at large distances (compared to 
$\beta$) from the boundary, the flow pattern is essentially 
the same as time translations. These are results due 
to Borchers and Yngvason (1999). 


Analyticity Properties of n-Point Functions 


The correlation functions describe the full physical 
content of the theory: all observable quantities can 
in principle be derived from them. This is so because 
according to the Wightman reconstruction theorem 
(which is closely related to the GNS construction) 
knowledge of the correlation functions allows the 
reconstruction of the full representation of the field 
algebra. The Wightman distributions $\{W_\beta^{(n)}\}_{n \in \mathbb{N}}$, 

$$W_\beta^{(n)}(t_2 - t_1, x_2 - x_1, \dots, t_n - t_{n-1}, x_n - x_{n-1}) = \big(\Omega_\beta, \phi_\beta(t_1, x_1) \cdots \phi_\beta(t_n, x_n)\,\Omega_\beta\big) \qquad [6]$$

where $\pi_\beta(W(f)) =: \exp\big(i \int dt\, dx\, f(t, x)\,\phi_\beta(t, x)\big)$, satisfy 
a number of key properties: locality, positivity, 
Poincaré covariance, and temperedness. These properties 
have been formulated for thermal fields by Bros 
and Buchholz (1996), and this section is entirely 
based on their work. 

The relativistic KMS condition implies that the 
Wightman distributions $\{W_\beta^{(n)}\}_{n \in \mathbb{N}}$ of a translation-invariant 
equilibrium state admit in the corresponding 
set of spacetime variables $(t_2 - t_1, x_2 - x_1), \dots,$ 
$(t_n - t_{n-1}, x_n - x_{n-1})$ an analytic continuation into 
the union of domains 

$$(\lambda_1 \mathcal{T}_{\beta e}) \times \cdots \times (\lambda_{n-1} \mathcal{T}_{\beta e})$$

for $\lambda_i > 0$, $i = 1, \dots, n-1$, and $\sum_{i=1}^{n-1} \lambda_i = 1$. The 
tube domains $\mathcal{T}_{\beta e}$ were specified in Definition 2. 
For $\beta \to \infty$, the tube $\mathcal{T}_{\beta e}$ tends to the vacuum tube 
$\mathcal{T}_{\rm vac} = \mathbb{R}^{d+1} + iV_+$; thus, one recovers the spectrum 
condition for the vacuum expectation values. 

Let us now turn to the Fourier transformed 
Wightman correlation functions. Translation invariance 
implies that $\tilde{W}_\beta^{(n)}(\nu_1, p_1, \dots, \nu_n, p_n)$ contains the factor 
$\delta(\nu_1 + \cdots + \nu_n)\,\delta(p_1 + \cdots + p_n)$. 
The Wightman distribution $\tilde{W}_\beta^{(n)}$ satisfies on the 
linear manifold $(\nu_1, p_1) + \cdots + (\nu_n, p_n) = 0$ the KMS 
relation in the energy variables: for any pair of 
multi-indices $(I, J)$ the identity 

$$\tilde{W}_\beta^{(n)}(J, I) = e^{-\beta \nu_I}\, \tilde{W}_\beta^{(n)}(I, J)$$

holds, where $\tilde{W}_\beta^{(n)}(I, J)$ is an abbreviation for 
$\tilde{W}_\beta^{(n)}(\{p_i\}_{i \in I}, \{p_j\}_{j \in J})$ and $\nu_I = \sum_{i \in I} \nu_i$. 
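We now specialize to the two-point function $W_\beta^{(2)}$. 
The corresponding commutator function $C(x)$ is 
given by 

$$C(x_1 - x_2) = W_\beta^{(2)}(x_1, x_2) - W_\beta^{(2)}(x_2, x_1)$$

Locality implies that $\operatorname{supp} C \subset \overline{V}_+ \cup \overline{V}_-$. The 
retarded and the advanced propagator $r$ and $a$, 
formally given by 

$$r(x) = i\theta(x_0)\,C(x), \qquad a(x) = -i\theta(-x_0)\,C(x)$$

satisfy the relation 

$$r - a = iC$$

which corresponds to a partition of the support of 
$C$ in its convex components: $\operatorname{supp} r \subset \overline{V}_+$ and 
$\operatorname{supp} a \subset \overline{V}_-$. For the free scalar field of mass $m$ 
the commutator function is 

$$C^{(m)}(x) = \frac{1}{(2\pi)^2} \int d^4p\; e^{-ipx}\, \tilde{C}^{(m)}(p)$$

with $p = (\nu, \vec{p})$ and 

$$\tilde{C}^{(m)}(p) = \frac{1}{2\pi}\, \operatorname{sgn}(\nu)\, \delta(\nu^2 - \vec{p}^{\,2} - m^2)$$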

The commutator function and, subsequently, the retarded 
and advanced propagators $r^{(m)}$ and $a^{(m)}$ are structural 
functions of the field algebra, which are determined by 
the c-number commutation relations of the fields. Thus, they are 
independent of the temperature, in contrast to the 
two-point function: 

$$\tilde{W}_\beta^{(m)}(p) = \frac{\tilde{C}^{(m)}(p)}{1 - e^{-\beta\nu}} \qquad [7]$$

Let now $\tilde{\tau}(p)$ be the Fourier transform of the time-ordered 
function $\tau(x)$. The relation 

$$\tilde{\tau}(p) = \frac{-i\,\tilde{r}(p) + i\,\tilde{a}(p)\,e^{-\beta\nu}}{1 - e^{-\beta\nu}}$$

shows that $\tilde{\tau}(p)$ and $-i\,\tilde{r}(p)$ only "coincide up to an 
exponential tail" at very high energies (Bros and 
Buchholz 1996). 
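The content of relation [7] and of the KMS relation in the energy variables can be illustrated at fixed spatial momentum. The sketch below assumes only that the commutator function is odd in the energy $\nu$; the particular profile `sample_commutator` is a hypothetical stand-in (not the free-field $\delta$-function), chosen so that all quantities are ordinary numbers. It exhibits three elementary consequences: the thermal weight at $-\nu$ is suppressed by $e^{-\beta\nu}$, the difference of the two weights reproduces the temperature-independent commutator, and the thermal factor equals one plus the Bose factor.

```python
import math


def sample_commutator(nu):
    """Hypothetical odd profile standing in for the commutator function."""
    return nu * math.exp(-nu * nu)


def thermal_weight(c, nu, beta):
    """W(nu) = c(nu) / (1 - exp(-beta * nu)), the structure of relation [7]."""
    return c(nu) / (-math.expm1(-beta * nu))


def bose_factor(energy, beta):
    """Bose-Einstein occupation number 1 / (exp(beta * energy) - 1)."""
    return 1.0 / math.expm1(beta * energy)


def kms_ratio(c, nu, beta):
    """W(-nu) / W(nu), which should equal exp(-beta * nu) for nu != 0."""
    return thermal_weight(c, -nu, beta) / thermal_weight(c, nu, beta)
```

For example, `kms_ratio(sample_commutator, 1.0, 0.8)` agrees with `math.exp(-0.8)` up to rounding, independently of the profile chosen, as long as it is odd.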


Particle Aspects 


The condition of locality (together with the relativistic 
KMS condition) leads to strong constraints on 
the general form of the thermal two-point functions 
that allow one to apply the techniques of the Jost-Lehmann-Dyson 
representation. As has been shown 
by Bros and Buchholz (1996), the interacting two-point 
function $W_\beta$ can be represented in the form 

$$W_\beta(t, x) = \int_0^\infty dm\; D_\beta(x, m)\, W_\beta^{(0)}(t, x, m)$$

Here $D_\beta(x, m)$ is a distribution in $x, m$ which is 
symmetric in $x$, and 

$$W_\beta^{(0)}(t, x, m) = (2\pi)^{-2} \int d\nu\, d^3p\; e^{-i(\nu t - px)}\, \tilde{W}_\beta^{(m)}(\nu, p)$$

is the two-point correlation function of the free 
thermal field of mass $m$. In contrast to the vacuum 
case, the damping factors $D_\beta(x, m)$ depend in a 
nontrivial way on the spatial variables $x$. The 
damping factors describe the dissipative effects of 
the thermal system on the propagation of sharply 
localized excitations. Bros and Buchholz suggested 
that the damping factor $D_\beta(x, m)$ can be decomposed 
into a discrete and an absolutely continuous 
part 

$$D_\beta(x, m) = \delta(m - m_0)\, D_{\beta,d}(x) + D_{\beta,c}(x, m)$$

and that the $\delta$-contribution in the damping factors is 
due to stable constituent particles of mass $m_0$ out of 
which the thermal states are formed, whereas the 
collective quasiparticle-like excitations only contribute 
to the continuous part of the damping factors 
(Bros and Buchholz 1996). 

In the case of spontaneously broken internal 
symmetries Bros and Buchholz (1998) have shown 
that the damping factors $D_\beta(x, m)$ which appear in 
the representation of current-field correlation 
functions 

$$(\Omega_\beta, j_0(t, x)\,\phi(0, 0)\,\Omega_\beta) = \int_0^\infty dm\, \Big( D_\beta^{(1)}(x, m)\,\partial_t W_\beta^{(0)}(t, x, m) + D_\beta^{(2)}(x, m)\, W_\beta^{(0)}(t, x, m) \Big)$$

indeed contain a discrete (in the sense of measures) 
zero-mass contribution and are slowly decreasing in 
$|x|$ for small values of $m$. Thus, these damping 
factors coincide locally with the Källén-Lehmann 
weights appearing in the case of spontaneous 
symmetry breaking in the vacuum sector (Bros and 
Buchholz 1998). It is easily seen in examples that 
there is no sharp energy-momentum dispersion law 
for the Goldstone particles. Thus, the Källén-Lehmann 
representation is better suited than Fourier 
transformation to uncover the particle aspects of 
thermal equilibrium states. 


Models of Thermal Field Theory 


In the simplest case, the classical Lagrangian density 
of the so-called $P(\phi)_2$ models is given by 

$$\mathcal{L} = (\partial_\mu \phi)(\partial^\mu \phi) - m^2 \phi^2 - \frac{\lambda}{4!}\,\phi^4 \qquad [8]$$

Here $\phi(t, x)$ denotes a real scalar field over spacetime. 
The construction of the corresponding quantized 
thermal field presented in this section (Gérard 
and Jäkel 2005) is based on the original ideas of 
Hoegh-Krohn (1974). 


Free Fields 


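Let $\mathfrak{h}_m$ denote the closure of $C_0^\infty(\mathbb{R})$ with respect to 
the norm $\|f\|^2 = (f, (1/2\epsilon) f)$, where $\epsilon(k) = \sqrt{k^2 + m^2}$ 
denotes the one-particle energy for a single neutral 
scalar boson and the scalar product is the usual 
$L^2$-scalar product. The subspaces associated to a 
double cone $\mathcal{O}$ are given by 

$$\mathfrak{h}_m(\mathcal{O}) := \{ h \in \mathfrak{h}_m \mid \operatorname{supp} \Re h,\ \operatorname{supp} \epsilon^{-1} \Im h \subset O \}$$

where $O$ denotes the basis of the double cone $\mathcal{O}$. 
The corresponding free quantum field is described by 
the Weyl algebra $\mathcal{W}(\mathfrak{h}_m) := \{ W(f) \mid f \in \mathfrak{h}_m \}$, together 
with the time evolution $\{\tau_t^\circ\}_{t \in \mathbb{R}}$, 

$$\tau_t^\circ(W(f)) = W(e^{i\epsilon t} f), \qquad f \in \mathfrak{h}_m$$

If $m > 0$, the KMS condition allows just one unique 
(quasifree) $(\tau^\circ, \beta)$-KMS state: 

$$\omega_\beta^\circ(W(f)) := e^{-\frac{1}{4}(f, (1 + 2\rho) f)}, \qquad \rho := (e^{\beta\epsilon} - 1)^{-1}$$

The GNS representation associated to the pair 
$(\mathcal{W}(\mathfrak{h}_m), \omega_\beta^\circ)$ is the well-known Araki-Woods representation, 
given by 

$$\mathcal{H}_{AW} := \Gamma(\mathfrak{h}_m \oplus \overline{\mathfrak{h}}_m),$$
$$\pi_{AW}(W(h)) := W_F\big( (1 + \rho)^{1/2} h \oplus \overline{\rho}^{1/2}\, \overline{h} \big), \qquad h \in \mathfrak{h}_m,$$
$$\Omega_{AW} := \Omega_F$$

Here $\overline{\mathfrak{h}}_m$ is the Hilbert space conjugate to $\mathfrak{h}_m$, $W_F(\cdot)$ 
denotes the usual Weyl operator on the Fock space 
$\Gamma(\mathfrak{h}_m \oplus \overline{\mathfrak{h}}_m)$ and $\Omega_F \in \Gamma(\mathfrak{h}_m \oplus \overline{\mathfrak{h}}_m)$ is the Fock 
vacuum. The Liouvillean $L_{AW}$ (see [4]) can be 
identified with $d\Gamma(\epsilon \oplus -\epsilon)$. 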

The local von Neumann algebra generated by 
$\{ \pi_{AW}(W(h)) \mid h \in \mathfrak{h}_m(\mathcal{O}) \}$ is denoted by $\mathcal{R}_{AW}(\mathcal{O})$. The 
algebra of observables for the free quantum field 
(and, as we will see, the $P(\phi)_2$ model) is the norm 
closure 

$$\mathcal{A} = \overline{\bigcup_{\mathcal{O} \subset \mathbb{R}^2} \mathcal{R}_{AW}(\mathcal{O})}^{\,C^*}$$

of the local von Neumann algebras. 


The Thermal P(φ)₂ Model 


In 1+1 spacetime dimensions Wick ordering is 
sufficient to eliminate the UV divergences of polynomial 
interactions. As it turns out, the leading 
order in the UV divergences is independent of the 
temperature (in agreement with the results found in 
Kopper et al. (2001)). Thus, it is a matter of 
convenience whether one uses the thermal covariance 
function $C_\beta$, 

$$C_\beta(h_1, h_2) := \Big( h_1, \frac{1 + 2\rho}{2\epsilon}\, h_2 \Big)_{L^2(\mathbb{R})}, \qquad h_1, h_2 \in \mathcal{S}(\mathbb{R})$$

or the vacuum covariance function $C_{\rm vac}$ to define 
the Wick ordering: 

$$:\!\phi(f)^n\!:_C \;= \sum_{m=0}^{[n/2]} \frac{n!}{m!\,(n - 2m)!}\; \phi(f)^{n-2m} \Big( -\frac{1}{2}\, C(f, f) \Big)^m$$
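The combinatorics of Wick ordering can be made concrete by replacing the field operator $\phi(f)$ with a real number $x$ and the covariance $C(f, f)$ with a positive number $c$. The following sketch, under exactly this simplifying assumption (the function names are illustrative), evaluates the Wick-ordering sum, and computes the expectation of a Wick power with respect to a centered Gaussian of variance $c$, which should vanish for every $n \ge 1$.

```python
from math import factorial, prod


def wick_coefficient(n, m):
    """The integer n! / (m! (n - 2m)!) appearing in the Wick-ordering sum."""
    return factorial(n) // (factorial(m) * factorial(n - 2 * m))


def wick_power(n, x, c):
    """:x^n:_c = sum_m n!/(m!(n-2m)!) * x^(n-2m) * (-c/2)^m."""
    return sum(wick_coefficient(n, m) * x ** (n - 2 * m) * (-0.5 * c) ** m
               for m in range(n // 2 + 1))


def gaussian_moment(k, c):
    """E[x^k] for a centered Gaussian of variance c: (k-1)!! c^(k/2), k even."""
    if k % 2:
        return 0.0
    return prod(range(k - 1, 0, -2)) * c ** (k // 2)


def wick_expectation(n, c):
    """Gaussian expectation of :x^n:_c, computed term by term."""
    return sum(wick_coefficient(n, m) * (-0.5 * c) ** m
               * gaussian_moment(n - 2 * m, c)
               for m in range(n // 2 + 1))
```

For instance, `wick_power(2, x, c)` equals `x**2 - c` and `wick_power(4, x, c)` equals `x**4 - 6*c*x**2 + 3*c**2`, and the powers obey the recursion $:\!x^{n+1}\!: \,= x :\!x^n\!: - \, n c :\!x^{n-1}\!:$ familiar from Hermite polynomials. Changing the covariance from $C_\beta$ to $C_{\rm vac}$ amounts to changing the number `c` in this toy setting.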


Now let $P(\lambda)$ be a real-valued polynomial, which is 
bounded from below. Then Euclidean techniques 
can be used to define the operator sum 

$$H_l := L_{AW} + \int_{-l}^{l} :\!P(\phi(x))\!:_C \, dx$$

in the Araki-Woods representation and to show that 
$H_l$ is essentially self-adjoint (Gérard and Jäkel 2005). 
Thus, (the closure of) $H_l$ can be used to define a 
perturbed time evolution $t \mapsto \tau_t^l$ on $\mathcal{A}$ and the vector 

$$\Omega_l := \frac{e^{-\beta H_l / 2}\, \Omega_{AW}}{\| e^{-\beta H_l / 2}\, \Omega_{AW} \|}$$

induces a KMS state $\omega_l$ for the dynamical system 
$(\pi_{AW}(\mathcal{A})'', \tau^l)$. 

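A finite propagation speed argument (using 
Trotter's product formula) shows that 

$$\tau_t^l(A) := e^{itH_l} A\, e^{-itH_l}, \qquad t \in \mathbb{R} \qquad [9]$$

is independent of $l$ for $A \in \mathcal{R}_{AW}(\mathcal{O})$, $t \in \mathbb{R}$ fixed and 
$l$ sufficiently large. Thus, there exists a limiting 
dynamics $\tau$ such that 

$$\lim_{l \to \infty} \| \tau_t^l(A) - \tau_t(A) \| = 0 \qquad [10]$$

for all $A \in \mathcal{R}_{AW}(\mathcal{O})$, $\mathcal{O}$ bounded. This norm convergence 
extends to the norm closure $\mathcal{A}$ of the local von 
Neumann algebras. 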

The existence of weak* limit points (which are 
states) of the (generalized) sequence $\{\omega_l\}_{l > 0}$ is a 
consequence of the Banach-Alaoglu theorem. The 
fact that all limit states satisfy the KMS condition 
with respect to the pair $(\mathcal{A}, \tau)$ follows from [10]. To 
prove that the sequence $\{\omega_l\}_{l > 0}$ has only one 
accumulation point, 

$$\omega_\beta := \lim_{l \to \infty} \omega_l \qquad [11]$$


is more delicate. Following Hoegh-Krohn, Nelson 
symmetry is used in Gérard and Jäkel (2005) to 
relate the interacting thermal theory on the real line 
to the $P(\phi)_2$ model on the circle $S^1$ of length $\beta$ at 
temperature 0. The existence of the limit [11] then 
follows from the uniqueness of the vacuum state on 
the circle. The relativistic KMS condition can be 
derived by Nelson symmetry as well, using the fact 
that the discrete spectrum of the model on the circle 
satisfies the spectrum condition. Since the limit [11] 
exists on the norm closure $\mathcal{A}$ of the weakly closed 
local algebras, it follows from a result of Takesaki 
and Winnink that $\omega_\beta$ is locally normal with respect 
to the Araki-Woods representation (which itself is 
locally normal with respect to the Fock representation). 
Consequently, 

$$\mathcal{R}_\beta(\mathcal{O}) := \pi_\beta(\mathcal{A}(\mathcal{O}))'' \cong \mathcal{R}_{AW}(\mathcal{O}), \qquad \mathcal{O} \text{ bounded}$$

that is, $\mathcal{R}_\beta(\mathcal{O})$ is (isomorphic to) the unique 
hyperfinite factor of type III$_1$. Moreover, the local 
Fock property implies that the split property holds. 


Perturbation Theory 


Steinmann (1995) has shown that perturbative expansions 
for the Wightman distributions of the $:\!\phi^4\!:_4$ model 
can be derived directly in the thermodynamic limit, 
using as only inputs the equations of motion and the 
(thermal) Wightman axioms. The result can be 
represented as a sum over generalized Feynman graphs. 

The method consists in solving the differential 
equations for the correlation functions which follow 
from the field equation, by a power series expansion 
in the coupling constant, using the axiomatic 
properties of the Wightman functions as subsidiary 
conditions. The Wightman axioms are expected to 
hold separately in each order of perturbation theory, 
with the exception of the cluster property. 

As expected, the UV renormalization can be 
chosen to be temperature independent, that is, one 
can use the same counterterms as in the vacuum 
case. But the infrared divergences are more severe; 
they cannot be removed by minor adjustments of the 
renormalization procedure. Various elaborate 
resummation techniques have been proposed to (at 
least partially) remove the infrared singularities. 

Another approach has been pursued by Kopper et al. 
(2001). They have investigated the perturbation expansion 
of the $:\!\phi^4\!:_4$ model in the imaginary-time formalism, 
using Wilson's flow equations. The result is once 
again that all correlation functions become ultraviolet- 


finite in all orders of the perturbation expansion, once 
the theory has been renormalized at zero temperature 
by usual renormalization prescriptions. 


Asymptotic Dynamics of Thermal Fields 


Timelike asymptotic properties of thermal correlation 
functions cannot be interpreted in terms of free fields 
due to persistent dissipative effects of a thermal 
system. This well-known fact manifests itself in a 
softened pole structure of the Green’s functions in 
momentum space and is at the root of the failure of 
the conventional approach to thermal perturbation 
theory (Bros and Buchholz 2002). In fact, assuming 
a sharp dispersion law, one would be forced to 
conclude that the scattering matrix is trivial (a 
famous no-go theorem by Narnhofer et al. (1983)). 

However, there seems to be a possibility to find an 
effective theory, which is much simpler and still 
reproduces the correct asymptotic behavior of the full 
theory. Disregarding low-energy excitations, Bros and 
Buchholz (2002) have shown that the $\delta$-contributions 
in the damping factors give rise to asymptotically 
leading terms which have a rather simple form: they are 
products of the thermal correlation function of a free 
field and a damping factor describing the dissipative 
effects of the model-dependent thermal background. 
This result is based on the assumption that the 
truncated $n$-point functions satisfy 


lim TA tii — baa = 


Toc 


ie cQ 


while the $\delta$-contribution in the damping factors 
exhibits, for large timelike separations $T$, a $T^{-3/2}$ 
type behavior (in 3 + 1 spacetime dimensions). 
Bros and Buchholz (2002) have shown that the 
asymptotically dominating parts of the correlation 
functions can be interpreted in terms of quasifree 
states acting on the algebra generated by a Hermitian 
field $\phi_0$ satisfying the commutation relations 

$$[\phi_0(t_1, x_1), \phi_0(t_2, x_2)] = \Delta_{m_0}(t_1 - t_2, x_1 - x_2)\, Z(x_1 - x_2)$$

Here $\Delta_{m_0}$ is the usual commutator function of a free 
scalar field of mass $m_0$ and $Z$ is an operator-valued 
distribution commuting with $\phi_0$ such that 
$\hat{\omega}_\beta(Z(x_1 - x_2)) = D_{\beta,d}(x_1 - x_2)$. (Here $\hat{\omega}_\beta$ denotes a KMS state 
for the algebra generated by $\phi_0$.) Intuitively speaking, 
the field $\phi_0$ carries an additional stochastic 
degree of freedom, which manifests itself in a central 
element that appears in the commutation relations 
and couples to the thermal background. 

As $\phi_0$ describes the interacting field asymptotically, 
one may expect that $\phi_0$ satisfies the field 
equation of the interacting field in an asymptotic 
sense. Bros and Buchholz (2002) have demonstrated 
that this assumption allows one to derive an explicit 
expression for the discrete part of the damping 
factors $D_{\beta,d}(x)$ in simple models. 


See also: Axiomatic Quantum Field Theory; Quantum 
Field Theory in Curved Spacetime; Scattering in 
Relativistic Quantum Field Theory: The Analytic 
Program; Tomita-Takesaki Modular Theory. 

Lattices, or differential-difference equations, are a 
special class of ordinary differential equations, with 
the independent variable $t$ playing the role of time and 
an infinite number of dependent variables $q_n = q_n(t)$ 
numbered by integer indices $n$, characterized by a 
translational invariance with respect to the shift 
$n \mapsto n + 1$. Due to this property, such equations are 
well suited for the description of processes in 
translationally symmetric systems like crystals. On 
his search for lattice models admitting interesting 
explicit solutions, M Toda discovered in 1967 the 
lattice which nowadays carries his name: 

$$\ddot{q}_n = e^{q_{n+1} - q_n} - e^{q_n - q_{n-1}} \qquad [1]$$


The Toda lattice is one of the most celebrated systems of 
mathematical physics, and a large amount of 
literature is devoted to it and to its various generalizations. 
Its most prominent property is "integrability," 
so that it is amenable to a rather complete 
exact treatment; moreover, it can be regarded as one 
of the basic models, illustrating all the relevant 
paradigms, notions, methods, and results of the 
theory of integrable systems (sometimes called the 
theory of solitons). One has a rare possibility to read 
a first-hand presentation of a large body of 
relevant results, including the authentic story of the 
original discovery, in Toda (1989). 


The Infinite Toda Lattice 
Model 


The classical infinite Toda lattice [1] describes a one-dimensional 
chain of unit mass particles, each one 
interacting with the nearest neighbors only, $q_n$ being 
the displacement of the $n$th particle from equilibrium. 
It can be treated within the Hamiltonian formalism 
of classical mechanics (with some care, because of 
the infinite number of degrees of freedom). In this 
framework, the second-order Newtonian equations 
of motion [1] are replaced by the first-order 
Hamiltonian ones, for the coordinates $q_n$ and 
canonically conjugate momenta $p_n$: 

$$\dot{q}_n = p_n, \qquad \dot{p}_n = e^{q_{n+1} - q_n} - e^{q_n - q_{n-1}} \qquad [2]$$

The corresponding Hamilton function is 

$$H = \frac{1}{2} \sum_{n \in \mathbb{Z}} p_n^2 + \sum_{n \in \mathbb{Z}} \big( e^{q_{n+1} - q_n} - 1 \big) \qquad [3]$$

One can understand infinite sums here formally, or, 
alternatively, one can impose suitable boundary conditions, 
like $q_{n+1} - q_n \to 0$, $p_n \to 0$ as $|n| \to \infty$ (usually 
one requires decay faster than any power of $1/|n|$). 


Multisoliton Solutions 


M Toda found in 1967 a number of exact traveling 
wave solutions of this system, including the 1-soliton 
solution: 

$$q_n(t) = \log \frac{1 + e^{-2(\gamma_1 n - \beta_1 t + \delta_1)}}{1 + e^{-2(\gamma_1 (n-1) - \beta_1 t + \delta_1)}} \qquad [4]$$

or, equivalently, 

$$e^{q_{n+1}(t) - q_n(t)} = 1 + \frac{\beta_1^2}{\cosh^2(\gamma_1 n - \beta_1 t + \delta_1)} \qquad [5]$$

where $\gamma_1 > 0$, $\beta_1 = \pm\sinh \gamma_1$, and $\delta_1$ is an arbitrary 
phase. Such a soliton moves with the velocity 
$v_1 = \beta_1 / \gamma_1$ (to the right, if $v_1 > 0$, and to the left, if 
$v_1 < 0$). Note that the faster the soliton is, the larger its 
amplitude. Multisoliton solutions were constructed in 
1973 by R Hirota with the help of his ingenious 
"direct" (or bilinear) method. They can be written as 

$$e^{q_{n+1}(t) - q_n(t)} = \frac{\tau_{n+1}(t)\, \tau_{n-1}(t)}{\tau_n^2(t)} \qquad [6]$$


where, for an $M$-soliton solution, $\tau_n(t)$ can be 
represented through the $M \times M$ determinant depending 
on $2M$ parameters $z_j \in (-1, 1)$ and $c_j \in \mathbb{R}$: 


wena) 7 
1<ij<M 


T(t) = det (^ + 
i= Sij 

where $c_j(t) = c_j e^{2\beta_j t}$, $\beta_j = (1/2)(z_j^{-1} - z_j)$.
If one sets $z_j = \pm e^{-\gamma_j}$ with $\gamma_j > 0$, then
$\beta_j = \pm\sinh\gamma_j$, and one can show that asymptotically
both for $t \to -\infty$ and for $t \to +\infty$ the solution [6]
looks like the sum of well-separated solitons [4] with the
velocities $v_j = \beta_j/\gamma_j$ and the respective phases
$\gamma_j n - \beta_j t + \delta_j^{(\pm)}$. This is usually
interpreted as a particle-like behavior of solitons. One can show
that the scattering of solitons is factorized:


= Y tog 22 
U, <U; ze 4% 
= Ze 
: Yu 2M [8] 
UL VU; Zj dii Sk 


which means that the phase shifts of individual 
solitons can be interpreted as coming from the 
pairwise interactions only. 
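
The 1-soliton formulas can be checked numerically. The following is a minimal sketch, not part of the original presentation: it evaluates $q_n(t)$ from [4], compares it with the equivalent form [5], and tests the Newtonian equation $\ddot q_n = e^{q_{n+1}-q_n} - e^{q_n-q_{n-1}}$ with a central second difference in time. The function names `soliton_q` and `check_soliton`, the parameter values, and the finite-difference step are illustrative assumptions.

```python
import numpy as np

def soliton_q(n, t, gamma=0.8, sign=1.0, delta=0.3):
    """Displacement q_n(t) of the 1-soliton [4], with beta = sign * sinh(gamma)."""
    beta = sign * np.sinh(gamma)
    num = 1.0 + np.exp(-2.0 * (gamma * n - beta * t + delta))
    den = 1.0 + np.exp(-2.0 * (gamma * (n - 1) - beta * t + delta))
    return np.log(num / den)

def check_soliton(gamma=0.8, sign=1.0, delta=0.3, t=0.7, h=1e-3):
    """Return (error in identity [5], residual of the equations of motion)."""
    n = np.arange(-10, 11, dtype=float)
    beta = sign * np.sinh(gamma)
    q = lambda m, s: soliton_q(m, s, gamma, sign, delta)

    # Equivalent form [5]: exp(q_{n+1} - q_n) = 1 + beta^2 / cosh^2(...)
    lhs5 = np.exp(q(n + 1, t) - q(n, t))
    rhs5 = 1.0 + beta**2 / np.cosh(gamma * n - beta * t + delta) ** 2
    err5 = np.max(np.abs(lhs5 - rhs5))

    # Second time derivative by a central difference (assumed step h)
    qdd = (q(n, t + h) - 2.0 * q(n, t) + q(n, t - h)) / h**2
    force = np.exp(q(n + 1, t) - q(n, t)) - np.exp(q(n, t) - q(n - 1, t))
    err_eom = np.max(np.abs(qdd - force))
    return err5, err_eom
```

Both residuals should be small for either sign of $\beta_1$; the second one is limited by the finite-difference step rather than by the formula.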


Integrability 


The infinite Toda lattice is completely integrable in 
the sense of the classical Hamiltonian mechanics: it 
admits an infinite number of functionally indepen- 
dent integrals of motion in involution. This was 
demonstrated in 1974 by M Hénon. An instance of 
these higher integrals of motion is given by 


$$H_3(p,q) = \frac{1}{3}\sum_{n\in\mathbb{Z}} p_n^3 + \sum_{n\in\mathbb{Z}} (p_n + p_{n+1})\, e^{q_{n+1}-q_n}$$   [9]


Hamiltonian flows corresponding to the higher 
integrals of motion (usually referred to as higher 
Toda flows) form the “Toda lattice hierarchy.” A 
beautiful approach to this hierarchy is based on the 
Lax representation of the Toda lattice, discovered in 
1974 independently by H Flaschka and S Manakov. 
In the variables $a_n$, $b_n$, related to $q_n$, $p_n$ by


$$a_n = e^{q_{n+1}-q_n}, \qquad b_n = p_n$$   [10]


equations of motion of the Toda lattice [2] are rewritten as


$$\dot a_n = a_n (b_{n+1} - b_n), \qquad \dot b_n = a_n - a_{n-1}$$   [11]


It turns out that eqns [11] are equivalent to the operator equation


$$\dot L = [L, A_+] = [A_-, L]$$   [12]


where $L$ and $A_\pm$ are linear difference operators with
coefficients depending on $a_n$, $b_n$:


$$L = \sum_{n\in\mathbb{Z}} b_n E_{n,n} + \sum_{n\in\mathbb{Z}} a_n E_{n,n+1} + \sum_{n\in\mathbb{Z}} E_{n+1,n}$$   [13]


$$A_+ = \sum_{n\in\mathbb{Z}} b_n E_{n,n} + \sum_{n\in\mathbb{Z}} E_{n+1,n}, \qquad A_- = \sum_{n\in\mathbb{Z}} a_n E_{n,n+1}$$   [14]


Here difference operators are represented as infinite matrices,
$E_{m,n}$ being the matrix with the only nonvanishing element equal
to 1 in the position $(m,n)$. A diagonal similarity (gauge)
transformation of the matrix $L$ leads to an equivalent Lax
representation of the Toda lattice:


$$\dot L_0 = [L_0, A_0]$$   [15]

with

$$L_0 = \sum_{n\in\mathbb{Z}} b_n E_{n,n} + \sum_{n\in\mathbb{Z}} a_n^{1/2} (E_{n+1,n} + E_{n,n+1})$$   [16]

$$A_0 = \frac{1}{2}\sum_{n\in\mathbb{Z}} a_n^{1/2} (E_{n+1,n} - E_{n,n+1})$$   [17]


Being equivalent for the Toda lattice, these two Lax representations
admit nonequivalent generalizations (see below). Note that the
matrices $A_\pm$ in [14] may be interpreted as $A_\pm = \pi_\pm(L)$,
where $\pi_\pm$ stands for the lower-triangular, resp., strictly
upper-triangular part. The commuting higher members of the Toda
lattice hierarchy (enumerated by $s \in \mathbb{N}$) are
characterized by the Lax equations of the form [12] with the same
Lax matrix $L$ as in [13] and with $A_\pm = \pi_\pm(L^s)$. In the
Lax representation [15], the higher Toda flows are obtained by
choosing $A_0 = \mathrm{skew}(L_0^s)$, where "skew" denotes the
skew-symmetric part (strictly lower-triangular part minus strictly
upper-triangular part) of the symmetric matrix. The Hamilton
functions of the higher flows are obtained as
$H_s \sim \mathrm{tr}(L^s) = \mathrm{tr}(L_0^s)$.


Inverse Scattering 


H Flaschka and S Manakov used the Lax representation as the basis
for applying the inverse-scattering, or inverse-spectral,
transformation method (IST) to the infinite Toda lattice. It was the
first application of IST in the lattice context. The matrix $L_0$ in
[16] is symmetric tridiagonal, which implies that the operator $L_0$
is second order and self-adjoint. The direct and inverse-spectral
problem for $L_0\psi = \mu\psi$ with such operators $L_0$ is well
studied and parallel, to a large extent, to the corresponding theory
for second-order differential operators. In the rapidly decaying
case, the set of spectral data of the operator $L_0$, allowing for a
solution of the inverse problem, consists of:


1. eigenvalues $\mu_j = z_j + z_j^{-1}$ of the discrete spectrum,
with $z_j \in (-1,1)$;

2. normalizing coefficients $\gamma_j$ of the corresponding
eigenfunctions; and

3. reflection coefficient $r(z)$ for $|z| = 1$, characterizing
the continuous spectrum $\mu = z + z^{-1} \in [-2,2]$.


The solution of the inverse-spectral problem is given in terms of
the Riemann-Hilbert problem or its variants, like the Gelfand-Levitan
equation. Equation [12] means that the evolution of the operator $L$,
induced by the evolution of $q_n(t)$, $p_n(t)$ in virtue of the Toda
lattice equations [2], is "isospectral." More precisely, the discrete
eigenvalues are integrals of motion, while the evolution of other
spectral data is governed by simple linear equations:


$$z_j = \mathrm{const.}, \qquad \gamma_j(t) = \gamma_j(0)\, e^{(z_j^{-1} - z_j)t}, \qquad r(z,t) = r(z,0)\, e^{(z^{-1} - z)t}$$   [18]


In particular, the multisoliton solutions correspond to the
reflectionless case $r(z,t) \equiv 0$. The IST solution of the
initial-value problem for the infinite Toda lattice can be
schematically depicted as in Figure 1.


Figure 1 General scheme of the IST: $q_n(0), p_n(0)$ $\to$ (direct-spectral problem) $\to$ $z_j(0), r(z,0)$ $\to$ (linear evolution) $\to$ (inverse-spectral problem) $\to$ $q_n(t), p_n(t)$.


Bi-Hamiltonian Structure

The canonical Poisson bracket for the variables $q_n$, $p_n$ turns
in the Flaschka-Manakov variables [10] into


$$\{b_n, a_n\}_1 = -a_n, \qquad \{a_n, b_{n+1}\}_1 = -a_n$$   [19]


(all other brackets of the coordinate functions vanish), and the
system [11] is Hamiltonian with respect to this bracket, with the
Hamilton function


$$H_2 = \frac{1}{2}\sum_{n\in\mathbb{Z}} b_n^2 + \sum_{n\in\mathbb{Z}} (a_n - 1)$$


However, one can define also a different Poisson bracket for the
variables $a_n$, $b_n$:


$$\{b_n, a_n\}_2 = -b_n a_n, \qquad \{a_n, b_{n+1}\}_2 = -a_n b_{n+1},$$
$$\{b_n, b_{n+1}\}_2 = -a_n, \qquad \{a_n, a_{n+1}\}_2 = -a_n a_{n+1}$$   [20]


with the following properties: it is compatible with the first one
(i.e., their linear combinations are again Poisson brackets), and
the system [11] is Hamiltonian with respect to this bracket, with
the Hamilton function $H_1 = \sum b_n$. So, the Toda lattice in the
form [11] is a bi-Hamiltonian system. This result is due to M Adler
(1979). The bi-Hamiltonian property, introduced by F Magri in 1978
on the example of the Korteweg-de Vries equation, has been
established since then as an alternative (and highly effective and
informative) definition of integrability. Actually, the Toda lattice
[11] is even tri-Hamiltonian, since there exists one more local
Poisson bracket for the variables $a_n$, $b_n$ with similar
properties, discovered by B Kupershmidt in 1985.
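
The statement that the brackets [19] and [20] generate the same flow [11] from the Hamilton functions $H_2$ and $H_1$ can be verified on a finite open chain. The sketch below is an illustration under stated assumptions: it uses a truncation with $a_0 = a_N = 0$, the sign convention $\dot f = \{H, f\}$, and the hypothetical helper names `poisson_matrices`, `toda_field`, and `hamiltonian_flows`.

```python
import numpy as np

def poisson_matrices(a, b):
    """Poisson tensors of the brackets [19] and [20] on an open chain.

    Coordinates are ordered x = (b_1..b_N, a_1..a_{N-1});
    entry [i, j] of each matrix is the bracket {x_i, x_j}.
    """
    N = len(b)
    P1 = np.zeros((2 * N - 1, 2 * N - 1))
    P2 = np.zeros_like(P1)
    ia = lambda n: N + n  # position of a_n (0-based n)

    def put(P, i, j, val):
        P[i, j] = val
        P[j, i] = -val  # skew-symmetry

    for n in range(N - 1):
        put(P1, n, ia(n), -a[n])                  # {b_n, a_n}_1
        put(P1, ia(n), n + 1, -a[n])              # {a_n, b_{n+1}}_1
        put(P2, n, ia(n), -b[n] * a[n])           # {b_n, a_n}_2
        put(P2, ia(n), n + 1, -a[n] * b[n + 1])   # {a_n, b_{n+1}}_2
        put(P2, n, n + 1, -a[n])                  # {b_n, b_{n+1}}_2
        if n + 1 < N - 1:
            put(P2, ia(n), ia(n + 1), -a[n] * a[n + 1])  # {a_n, a_{n+1}}_2
    return P1, P2

def toda_field(a, b):
    """Right-hand side of [11], ordered as (b-dot, a-dot), with a_0 = a_N = 0."""
    a_pad = np.concatenate(([0.0], a, [0.0]))
    return np.concatenate((a_pad[1:] - a_pad[:-1], a * (b[1:] - b[:-1])))

def hamiltonian_flows(a, b):
    """Flows x-dot_j = sum_i dH/dx_i {x_i, x_j} for (H_2, bracket 1) and (H_1, bracket 2)."""
    N = len(b)
    P1, P2 = poisson_matrices(a, b)
    grad_H2 = np.concatenate((b, np.ones(N - 1)))       # H_2 = sum b^2/2 + sum a
    grad_H1 = np.concatenate((np.ones(N), np.zeros(N - 1)))  # H_1 = sum b
    return grad_H2 @ P1, grad_H1 @ P2
```

Both returned vectors should coincide with `toda_field(a, b)`. The check covers only the equality of the two flows; it does not test the Jacobi identity or the compatibility of the brackets.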


Darboux-Bäcklund Transformations
and Discretization


A further indispensable attribute of integrable systems are the
so-called Darboux-Bäcklund transformations. For the Toda lattice
they were first found by M Toda and M Wadati in 1975. A Bäcklund
transformation $(q_n, p_n) \mapsto (\tilde q_n, \tilde p_n)$ with
the parameter $h$ can be written as


$$1 + h p_n = e^{\tilde q_n - q_n} + h^2 e^{q_n - \tilde q_{n-1}}$$
$$1 + h \tilde p_n = e^{\tilde q_n - q_n} + h^2 e^{q_{n+1} - \tilde q_n}$$   [21]

This is a canonical transformation, possessing a classical
generating function. These formulas can be given a fundamentally
important interpretation in terms of the matrices


$$U_+ = \sum_{n\in\mathbb{Z}} e^{\tilde q_n - q_n} E_{n,n} + h \sum_{n\in\mathbb{Z}} E_{n+1,n}$$   [22]

$$U_- = I + h \sum_{n\in\mathbb{Z}} e^{q_{n+1} - \tilde q_n} E_{n,n+1}$$   [23]


The first formula in [21] is equivalent to the factorization
$I + hL = U_+ U_-$, while the second one is equivalent to the
factorization $I + h\tilde L = U_- U_+$, with the flipped factors.
The Bäcklund transformation [21] serves also as an integrable
discretization of the Toda flow [2] with the time step $h$.
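
The factorization picture can be tried out on a finite matrix. The following sketch is an illustration, not the author's construction: it assumes a finite $N \times N$ truncation with $a_0 = a_N = 0$, factors $I + hL$ into a lower-triangular and a unit upper-triangular factor without pivoting, and multiplies the factors in the flipped order. The names `lax_matrix`, `lower_times_unit_upper`, and `backlund_step` are hypothetical.

```python
import numpy as np

def lax_matrix(a, b):
    """Finite matrix of the form [13]: b on the diagonal, a above it, ones below it."""
    return np.diag(b) + np.diag(a, 1) + np.diag(np.ones(len(a)), -1)

def lower_times_unit_upper(M):
    """Factor M = Up @ Um, Up lower triangular, Um upper triangular with unit diagonal.

    No pivoting is used; all leading principal minors of M are assumed nonzero.
    """
    n = M.shape[0]
    Up = np.zeros_like(M)
    Um = np.eye(n)
    for j in range(n):
        for i in range(j, n):          # column j of the lower factor
            Up[i, j] = M[i, j] - Up[i, :j] @ Um[:j, j]
        for k in range(j + 1, n):      # row j of the unit upper factor
            Um[j, k] = (M[j, k] - Up[j, :j] @ Um[:j, k]) / Up[j, j]
    return Up, Um

def backlund_step(L, h):
    """One step L -> L~ defined by I + hL = U_+ U_-, I + hL~ = U_- U_+."""
    n = L.shape[0]
    Up, Um = lower_times_unit_upper(np.eye(n) + h * L)
    return (Um @ Up - np.eye(n)) / h
```

Since $U_- U_+ = U_-(U_+ U_-)U_-^{-1}$, the step is a similarity transformation and preserves the spectrum. For small $h$ the difference quotient $(\tilde L - L)/h$ should approach the commutator $[A_-, L]$ of [12], which is the sense in which the map discretizes the flow.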


Finite Open-End Toda Lattice 
Model 


The infinite Toda lattice [1] can be reduced to finite-dimensional
systems by imposing suitable boundary conditions, different from the
rapidly decaying ones. Particularly important are "open-end boundary
conditions," which correspond to placing the particles 0 and $N+1$
at $q_0 = +\infty$ and $q_{N+1} = -\infty$, respectively. In terms
of the Flaschka-Manakov variables, this means that $a_0 = a_N = 0$
and $b_0 = b_{N+1} = 0$. The Hamilton function of the resulting
system with $N$ degrees of freedom is


$$H_2(p,q) = \frac{1}{2}\sum_{n=1}^{N} p_n^2 + \sum_{n=1}^{N-1} e^{q_{n+1}-q_n}$$   [24]


This system consists of $N$ particles subject to repulsive forces
between nearest neighbors, and exhibits a scattering behavior both
as $t \to -\infty$ and $t \to +\infty$. It admits a Lax
representation of the same form [12] or [15] as in the infinite
case, but with all the matrices being now of finite size
$N \times N$, so that [13]-[14] and [16]-[17] are replaced by


$$L = \sum_{n=1}^{N} b_n E_{n,n} + \sum_{n=1}^{N-1} a_n E_{n,n+1} + \sum_{n=1}^{N-1} E_{n+1,n}$$   [25]


$$A_+ = \sum_{n=1}^{N} b_n E_{n,n} + \sum_{n=1}^{N-1} E_{n+1,n}, \qquad A_- = \sum_{n=1}^{N-1} a_n E_{n,n+1}$$   [26]


and


$$L_0 = \sum_{n=1}^{N} b_n E_{n,n} + \sum_{n=1}^{N-1} a_n^{1/2} (E_{n+1,n} + E_{n,n+1})$$   [27]


$$A_0 = \frac{1}{2}\sum_{n=1}^{N-1} a_n^{1/2} (E_{n+1,n} - E_{n,n+1})$$   [28]


The qualitative behavior of the solutions is easily understood: as
a consequence of repulsive interactions, the pairwise distances
between particles grow infinitely,
$a_n(t) = e^{q_{n+1}(t) - q_n(t)} \to 0$ as $t \to \pm\infty$, so
that the matrix $L_0$ becomes asymptotically diagonal, with the
limit velocities $b_n(\pm\infty) = \dot q_n(\pm\infty)$ as the
diagonal entries. Due to the isospectral evolution of $L_0$, these
limit velocities have to coincide with the eigenvalues $\mu_j$ of
$L_0$, which are integrals of motion. As $t \to -\infty$, they
appear on the diagonal in the increasing order (the rightmost
particle $q_1$ being the slowest, and the leftmost $q_N$ being the
fastest), while as $t \to +\infty$, their order on the diagonal
changes to the decreasing one (the particle $q_1$ becoming the
fastest and $q_N$ becoming the slowest).
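
The scattering picture can be observed in a small numerical experiment. The sketch below is only an illustration: it integrates the open-end equations [11] with a classical fourth-order Runge-Kutta step (an assumed choice of integrator, with an assumed step size) and compares the final momenta with the spectrum of $L_0$. The names `rhs`, `rk4_step`, `lax_symmetric`, and `evolve` are hypothetical.

```python
import numpy as np

def rhs(a, b):
    """Right-hand side of [11] with the open-end conditions a_0 = a_N = 0."""
    a_pad = np.concatenate(([0.0], a, [0.0]))
    return a * (b[1:] - b[:-1]), a_pad[1:] - a_pad[:-1]

def rk4_step(a, b, dt):
    """One classical Runge-Kutta step for the pair (a, b)."""
    k1a, k1b = rhs(a, b)
    k2a, k2b = rhs(a + 0.5 * dt * k1a, b + 0.5 * dt * k1b)
    k3a, k3b = rhs(a + 0.5 * dt * k2a, b + 0.5 * dt * k2b)
    k4a, k4b = rhs(a + dt * k3a, b + dt * k3b)
    a_new = a + dt * (k1a + 2 * k2a + 2 * k3a + k4a) / 6.0
    b_new = b + dt * (k1b + 2 * k2b + 2 * k3b + k4b) / 6.0
    return a_new, b_new

def lax_symmetric(a, b):
    """Symmetric tridiagonal matrix L_0 of [27]."""
    c = np.sqrt(a)
    return np.diag(b) + np.diag(c, 1) + np.diag(c, -1)

def evolve(a, b, t_end, dt=0.005):
    """Integrate from t = 0 to t = t_end with fixed step dt."""
    for _ in range(int(round(t_end / dt))):
        a, b = rk4_step(a, b, dt)
    return a, b
```

Starting, for example, from `a = [1, 1]`, `b = [0, 0, 0]` and integrating to a moderately large time, the off-diagonal variables should decay, the eigenvalues of `lax_symmetric(a, b)` should stay fixed up to integration error, and `b` should approach those eigenvalues in decreasing order.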


Moser’s Solution 


Integration of this system was first performed by J Moser in 1975.
His solution can be interpreted within the general scheme of the IST
(see Figure 1). The spectral data in this case consist, for example,
of the eigenvalues $\mu_j$ ($j = 1, \ldots, N$) of the matrix $L_0$
and the first components $r_j$ of the corresponding orthonormal
eigenvectors. The evolution of these data induced by the Toda flow
[2] turns out to be simple:


$$\mu_j = \mathrm{const.}, \qquad r_j^2(t) = \frac{r_j^2(0)\, e^{\mu_j t}}{\sum_{k=1}^{N} r_k^2(0)\, e^{\mu_k t}}$$   [29]

The IST is expressed by the identity

$$\sum_{j=1}^{N} \frac{r_j^2}{\mu - \mu_j} = \cfrac{1}{\mu - b_1 - \cfrac{a_1}{\mu - b_2 - \cdots - \cfrac{a_{N-1}}{\mu - b_N}}}$$   [30]


both parts of which represent the entry (1,1) of the matrix
$(\mu I - L_0)^{-1}$. It implies that all variables $a_n(t)$,
$b_n(t)$ are rational functions of $\mu_j$ and $e^{\mu_j t}$; in
particular, one finds:


$$a_n(t) = e^{q_{n+1}(t) - q_n(t)} = \frac{\tau_{n-1}(t)\,\tau_{n+1}(t)}{\tau_n^2(t)}$$   [31]


where $\tau_n(t)$ can be represented as an $n \times n$ Hankel
determinant


$$\tau_n(t) = \det\bigl(c_{i+j}(t)\bigr)_{0 \le i,j \le n-1}, \qquad c_k(t) = \sum_{j=1}^{N} \mu_j^k\, r_j^2(0)\, e^{\mu_j t}$$   [32]


Factorization Solution 


The Lax representation [12] is a particular instance of a general
construction, known under the name of Adler-Kostant-Symes (AKS)
method and found around 1980. The ingredients of this construction
are:


* a Lie algebra $\mathfrak{g}$, equipped with a nondegenerate scalar
product which is used to identify $\mathfrak{g}$ with its dual
space $\mathfrak{g}^*$;

* a splitting of $\mathfrak{g}$ into a direct sum of its two
subspaces $\mathfrak{g}_\pm$ which are also Lie subalgebras, with
$\pi_\pm : \mathfrak{g} \to \mathfrak{g}_\pm$ being the
corresponding projections;

* the Lie group $G$ of the Lie algebra $\mathfrak{g}$, and its Lie
subgroups $G_\pm$ with the Lie algebras $\mathfrak{g}_\pm$; and

* a function $\phi : \mathfrak{g} \to \mathfrak{g}$ covariant with
respect to the adjoint action of $G$ (in the case of matrix Lie
algebras and groups, one can take, e.g., $\phi(L) = L^s$).


The AKS method provides a formula for the solution of the
initial-value problem for Lax equations of the form [12] with the
Lax matrix $L \in \mathfrak{g}$ and $A_\pm = \pi_\pm(\phi(L))$. The
solution is given by


$$L(t) = U_+^{-1}(t)\, L(0)\, U_+(t) = U_-(t)\, L(0)\, U_-^{-1}(t)$$   [33]


where the elements $U_\pm(t) \in G_\pm$ solve the factorization
problem


$$\exp\bigl(t\phi(L(0))\bigr) = U_+(t)\, U_-(t)$$   [34]


For the open-end Toda lattice $\mathfrak{g} = gl(N)$, the Lie
algebra of all $N \times N$ matrices, $\mathfrak{g}_\pm$ consist of
all lower-triangular, resp., strictly upper-triangular, matrices.
Accordingly, $G = GL(N)$, the Lie group of all nondegenerate
$N \times N$ matrices, and $G_\pm$ consist of all nondegenerate
lower-triangular matrices, resp., of upper-triangular matrices with
units on the diagonal. The corresponding factorization problem in
$G$ is well known in linear algebra under the name of LR
factorization, and is related to Gaussian elimination. From [33]
and the well-known expression of the diagonal elements of the
lower-triangular factor in the LR factorization through the minors
of the factorized matrix, we find:


$$a_n(t) = a_n(0)\, \frac{\tau_{n-1}(t)\,\tau_{n+1}(t)}{\tau_n^2(t)}$$   [35]


where $\tau_n(t)$ is the upper-left $n \times n$ minor of the
matrix $\exp(tL(0))$. If $L(t)$ is the Lax matrix along the solution
of the Toda flow ($\phi(L) = L$), then the sampling of the matrix
$\exp(L(t))$ at the integer times $t \in \mathbb{Z}$ coincides with
the result of application of Rutishauser's LR algorithm to the
matrix $\exp(L(0))$. The LR algorithm applied to the matrix
$I + hL(0)$ is nothing other than the Bäcklund transformation [21]
in the open-end situation.
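
The Hankel-determinant formula [31]-[32] and the minor formula [35] describe the same solution, and this can be cross-checked numerically. The sketch below is an illustration under stated assumptions: it works with the symmetric matrix $L_0$ (whose exponential has the same leading principal minors as that of $L$, the two matrices being related by a diagonal similarity), computes the exponential from an eigendecomposition, and uses the hypothetical names `hankel_tau`, `a_from_moser`, and `a_from_factorization`.

```python
import numpy as np

def hankel_tau(mu, w, n):
    """n x n Hankel determinant of the moments c_k = sum_j mu_j^k w_j, as in [32]."""
    if n == 0:
        return 1.0
    c = [np.sum(mu**k * w) for k in range(2 * n - 1)]
    return np.linalg.det(np.array([[c[i + j] for j in range(n)] for i in range(n)]))

def spectral_data(a0, b0):
    """Eigenvalues and orthonormal eigenvectors of L_0 built from a(0), b(0)."""
    c = np.sqrt(a0)
    L0 = np.diag(b0) + np.diag(c, 1) + np.diag(c, -1)
    return np.linalg.eigh(L0)

def a_from_moser(a0, b0, t):
    """a_n(t) from [31]-[32], with weights r_j(0)^2 exp(mu_j t)."""
    mu, V = spectral_data(a0, b0)
    w = V[0, :] ** 2 * np.exp(mu * t)  # squared first components times exp(mu_j t)
    N = len(b0)
    tau = [hankel_tau(mu, w, n) for n in range(N + 1)]
    return np.array([tau[n - 1] * tau[n + 1] / tau[n] ** 2 for n in range(1, N)])

def a_from_factorization(a0, b0, t):
    """a_n(t) from [35], with tau_n the leading n x n minor of exp(t L_0(0))."""
    mu, V = spectral_data(a0, b0)
    E = (V * np.exp(mu * t)) @ V.T
    N = len(b0)
    tau = [1.0] + [np.linalg.det(E[:n, :n]) for n in range(1, N + 1)]
    return np.array([a0[n - 1] * tau[n - 1] * tau[n + 1] / tau[n] ** 2
                     for n in range(1, N)])
```

At $t = 0$ the first function should reproduce the initial values $a_n(0)$, and for $t \ne 0$ the two functions should agree. Hankel determinants become ill-conditioned as $N$ grows, so the comparison is meant for small $N$ only.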


Finite Periodic Toda Lattice 
Model 


A different reduction of the infinite Toda lattice to a
finite-dimensional system appears by imposing periodic boundary
conditions, $q_{n+N}(t) = q_n(t)$ for all $n \in \mathbb{Z}$ (of
course, such relations hold also for the Flaschka-Manakov variables
$a_n$, $b_n$). The Hamilton function of the resulting system with
$N$ degrees of freedom is


$$H_2(p,q) = \frac{1}{2}\sum_{n\in\mathbb{Z}/N\mathbb{Z}} p_n^2 + \sum_{n\in\mathbb{Z}/N\mathbb{Z}} e^{q_{n+1}-q_n}$$   [36]


This system consists of $N$ particles $q_n$ ($n = 1, \ldots, N$),
and it is always assumed that $q_{N+1} = q_1$ and $q_0 = q_N$. Thus,
the potential energy in [36] differs from the potential energy in
[24] by one additional term $e^{q_1 - q_N}$. However, this modest
difference leads to much more complicated dynamics of the system
(quasiperiodic instead of scattering). It is convenient to replace
infinite matrices in the Lax representation [12] by finite ones, of
size $N \times N$, but depending on an additional parameter
$\lambda$ (called the spectral parameter):


$$L = \sum_{n\in\mathbb{Z}/N\mathbb{Z}} b_n E_{n,n} + \lambda^{-1}\sum_{n\in\mathbb{Z}/N\mathbb{Z}} a_n E_{n,n+1} + \lambda\sum_{n\in\mathbb{Z}/N\mathbb{Z}} E_{n+1,n}$$   [37]

$$A_+ = \sum_{n\in\mathbb{Z}/N\mathbb{Z}} b_n E_{n,n} + \lambda\sum_{n\in\mathbb{Z}/N\mathbb{Z}} E_{n+1,n}$$   [38]

$$A_- = \lambda^{-1}\sum_{n\in\mathbb{Z}/N\mathbb{Z}} a_n E_{n,n+1}$$   [39]


The Lax representation [12] holds identically in $\lambda$, so that
the spectral parameter drops out of the equations of motion. Note
that, unlike the open-end case, $L$ is no longer a tridiagonal
matrix, because of the nonvanishing entries in the positions
$(N, 1)$ and $(1, N)$.
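
That the Lax equation holds identically in the spectral parameter can be checked directly. The following sketch is an illustration with hypothetical names (`periodic_lax`, `periodic_rhs`, `lax_derivative`); it assumes $N \ge 3$, takes all indices mod $N$, and compares the commutator $[A_-, L]$ with the time derivative of $L$ assembled from [11] for several values of $\lambda$.

```python
import numpy as np

def periodic_lax(a, b, lam):
    """L(lambda) of [37] and A_-(lambda) of [39]; indices are taken mod N."""
    N = len(b)
    L = np.diag(np.asarray(b, dtype=complex))
    A_minus = np.zeros((N, N), dtype=complex)
    for n in range(N):
        m = (n + 1) % N
        L[n, m] += a[n] / lam        # lambda^{-1} a_n E_{n,n+1}
        L[m, n] += lam               # lambda E_{n+1,n}
        A_minus[n, m] += a[n] / lam
    return L, A_minus

def periodic_rhs(a, b):
    """Equations [11] with a_{n+N} = a_n, b_{n+N} = b_n."""
    return a * (np.roll(b, -1) - b), a - np.roll(a, 1)

def lax_derivative(a, b, lam):
    """dL/dt assembled from [11]; only the a_n and b_n entries of L depend on t."""
    adot, bdot = periodic_rhs(a, b)
    N = len(b)
    dL = np.diag(bdot.astype(complex))
    for n in range(N):
        dL[n, (n + 1) % N] += adot[n] / lam
    return dL
```

For any nonzero complex `lam`, the matrix `A_minus @ L - L @ A_minus` should equal `lax_derivative(a, b, lam)`, which is the statement that $\lambda$ drops out of the equations of motion.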


Inverse-Spectral Transformation 


Solution of the periodic lattice in terms of multidimensional theta
functions was given independently by E Date and S Tanaka, and by
I Krichever in 1976. In this case, the set of the spectral data is
more complicated; it includes:


* a hyperelliptic Riemann surface $\mathcal{R}$ of genus $N - 1$
determined by the eigenvalues of the periodic boundary-value problem
for the operator $L$, or, in other words, by the equation
$R(\lambda, \mu) = \det(L(\lambda) - \mu I) = 0$; and

* $N - 1$ points $P_j$ on $\mathcal{R}$, which correspond to the
eigenvalues of $L$ with vanishing boundary conditions.


Due to [12], the Riemann surface $\mathcal{R}$ itself is an integral
of motion, and the evolution of points $P_j$ is such that the image
of the divisor $P_1 + \cdots + P_{N-1}$ under the Abel map moves
along a straight line in the Jacobi variety of $\mathcal{R}$.
Solution of the inverse-spectral problem is given in terms of
multidimensional theta-functions by formula [35] with
$\tau_n(t) = \theta(nU - tV + D)$, where $U$, $V$, $D$ are certain
vectors on the Jacobian of $\mathcal{R}$ (the first two of them
depending on the spectrum $\mathcal{R}$ only).


Loop Algebras 


The periodic Toda lattice can be included into the general AKS
scheme, if one interprets the Lax matrix $L$ as an element of the
loop algebra $\boldsymbol{g}$ which consists of Laurent polynomials
(in $\lambda$) with coefficients from $gl(N)$, singled out by the
additional condition


$$\boldsymbol{g} = \{L(\lambda) \in gl(N)[\lambda, \lambda^{-1}] : \Omega L(\lambda)\Omega^{-1} = L(\omega\lambda)\}$$


where $\Omega = \mathrm{diag}(1, \omega, \ldots, \omega^{N-1})$,
$\omega = \exp(2\pi i/N)$. Subalgebras $\boldsymbol{g}_\pm$ consist
of Laurent polynomials with respect to non-negative, resp., strictly
negative powers of $\lambda$. The Lie group $\boldsymbol{G}$
corresponding to the Lie algebra $\boldsymbol{g}$ consists of
$GL(N)$-valued functions $U(\lambda)$ of the complex parameter
$\lambda$, regular in $\mathbb{CP}^1 \setminus \{0, \infty\}$ and
satisfying $\Omega U(\lambda)\Omega^{-1} = U(\omega\lambda)$. Its
subgroups $\boldsymbol{G}_\pm$ corresponding to the Lie algebras
$\boldsymbol{g}_\pm$ are singled out by the following conditions:
elements of $\boldsymbol{G}_+$ are regular in the neighborhood of
$\lambda = 0$, while elements of $\boldsymbol{G}_-$ are regular in
the neighborhood of $\lambda = \infty$ and take at
$\lambda = \infty$ the value $I$. The corresponding factorization is
called the generalized LR factorization. As opposed to the open-end
case, finding such a factorization is a problem of the
Riemann-Hilbert type which is solved in terms of algebraic geometry
and theta-functions rather than in terms of linear algebra and
exponential functions. This approach to the periodic Toda lattice is
due to Reyman and Semenov-Tian-Shansky (1979) and, independently, to
M Adler and P van Moerbeke (1980).


Generalizations: Lie-Algebraic Systems 


The AKS interpretation of the finite Toda lattices 
leads directly to their generalizations by replacing 
the algebra gl(N), resp., the loop algebra over gl(N), 
by simple Lie algebras, resp. affine Lie algebras. 
These generalized Toda systems were introduced in 
1976 by O Bogoyavlensky and solved in 1979 
independently by M Olshanetsky, A Perelomov, 
and by B Kostant. 


Simple Lie Algebras 


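Let $\mathfrak{g}$ be a simple Lie algebra (complex or real split),
and $\mathfrak{h}$ its Cartan subalgebra. Let further
$\Delta = \Delta_+ \cup \Delta_-$ be the root system of
$\mathfrak{g}$, decomposed into the set of positive roots
$\Delta_+$ and the set of negative roots $\Delta_-$. One has a
direct vector space sum
$\mathfrak{g} = \mathfrak{g}_+ \oplus \mathfrak{g}_-$, where
$\mathfrak{g}_+$ is spanned by the root spaces for positive roots
and by $\mathfrak{h}$, while $\mathfrak{g}_-$ is spanned by the root
spaces for negative roots (Borel decomposition). For
$\alpha \in \Delta$ let $E_\alpha$ be a corresponding root vector.
So, $[H, E_\alpha] = \alpha(H)E_\alpha$ for all $H \in \mathfrak{h}$.
The root $\alpha \in \mathfrak{h}^*$ may be identified with
$H_\alpha \in \mathfrak{h}$ defined by
$\langle H_\alpha, H\rangle = \alpha(H)$ for all
$H \in \mathfrak{h}$. It is easy to deduce that
$[E_\alpha, E_{-\alpha}] = c_\alpha H_\alpha$, where
$c_\alpha = \langle E_\alpha, E_{-\alpha}\rangle$. The system of
simple roots will be denoted by $\Phi \subset \Delta_+$.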

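The generalized Toda lattice for the Lie algebra $\mathfrak{g}$ is
the following system of differential equations on
$\mathfrak{h} \times \mathfrak{h}$:


$$\dot Q = P, \qquad \dot P = -\sum_{\alpha\in\Phi} e^{\alpha(Q)} [E_\alpha, E_{-\alpha}] = -\sum_{\alpha\in\Phi} c_\alpha e^{\alpha(Q)} H_\alpha$$   [40]


This system can be given a Hamiltonian formulation, with the
Hamilton function


$$H_{\mathfrak{g}} = \frac{1}{2}\langle P, P\rangle + \sum_{\alpha\in\Phi} c_\alpha e^{\alpha(Q)}$$   [41]


It is completely integrable, and has a Lax representation [12] with


$$L = P + \sum_{\alpha\in\Phi} E_\alpha + \sum_{\alpha\in\Phi} e^{\alpha(Q)} E_{-\alpha}$$   [42]

$$A_+ = P + \sum_{\alpha\in\Phi} E_\alpha, \qquad A_- = \sum_{\alpha\in\Phi} e^{\alpha(Q)} E_{-\alpha}$$   [43]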


The usual open-end Toda lattice corresponds to the algebra $sl(N)$
(series $A_{N-1}$), so that the Hamilton function [24] can be
denoted by $H_{A_{N-1}}$. The Hamilton functions of the generalized
lattices corresponding to other classical algebras $so(2N+1)$
(series $B_N$), $sp(N)$ (series $C_N$), and $so(2N)$ (series $D_N$)
can be written in the canonically conjugate variables $q_n$, $p_n$
($n = 1, \ldots, N$) as


$$H_{\mathfrak{g}}(p,q) = H_{A_{N-1}}(p,q) + \begin{cases} e^{-q_N}, & \mathfrak{g} = B_N \\ e^{-2q_N}, & \mathfrak{g} = C_N \\ e^{-q_N - q_{N-1}}, & \mathfrak{g} = D_N \end{cases}$$   [44]


Affine Lie Algebras 


Turning to the generalizations of the periodic Toda lattice, let
$\theta$ be a Coxeter automorphism of a simple complex algebra
$\mathfrak{g}$, the order of $\theta$ being $m$. Introduce the loop
algebra $\boldsymbol{g}$ as the Lie algebra of Laurent polynomials


$$\boldsymbol{g} = \{L(\lambda) \in \mathfrak{g}[\lambda, \lambda^{-1}] : \theta(L(\lambda)) = L(\omega\lambda)\}$$


where $\omega = \exp(2\pi i/m)$. Denote by $\mathfrak{g}_j$ the
eigenspaces of $\theta$ corresponding to the eigenvalues $\omega^j$
($j \in \mathbb{Z}/m\mathbb{Z}$). Set $\mathfrak{a} = \mathfrak{g}_0$,
and let $s$ denote the dimension of $\mathfrak{a}$. By definition of
the Coxeter automorphism, $\mathfrak{a}$ is an abelian subalgebra of
$\mathfrak{g}$. Denote by $\Psi$ the set of
$\alpha \in \mathfrak{a}^*$ for which there exist nonzero elements
$E_\alpha \in \mathfrak{g}_1$ with
$[H, E_\alpha] = \alpha(H)E_\alpha$ for all $H \in \mathfrak{a}$.
The elements $E_{-\alpha} \in \mathfrak{g}_{-1}$ are defined
similarly. It can be shown that $\Psi$ contains $s + 1$ elements, so
that between them there exists exactly one linear relation. The
elements of $\Psi$ are called simple weights of the loop algebra
$\boldsymbol{g}$. The Lie algebra $\boldsymbol{g}$ is a direct sum
of its two subspaces $\boldsymbol{g}_\pm$ consisting of Laurent
polynomials with non-negative, resp., with strictly negative powers
of $\lambda$; these subspaces are also Lie subalgebras.

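Now the generalized Toda lattice related to the loop algebra
$\boldsymbol{g}$ can be introduced as the system of differential
equations on $\mathfrak{a} \times \mathfrak{a}$, which looks
formally exactly as [40], and has the Hamilton function which looks
exactly as [41], but with the set of simple roots $\Phi$ of
$\mathfrak{g}$ being replaced by the set of simple weights $\Psi$ of
$\boldsymbol{g}$. The matrices participating in the Lax
representation [12] belong now to the loop algebra $\boldsymbol{g}$:


$$L(\lambda) = P + \lambda\sum_{\alpha\in\Psi} E_\alpha + \lambda^{-1}\sum_{\alpha\in\Psi} e^{\alpha(Q)} E_{-\alpha}$$   [45]


$$A_+(\lambda) = P + \lambda\sum_{\alpha\in\Psi} E_\alpha, \qquad A_-(\lambda) = \lambda^{-1}\sum_{\alpha\in\Psi} e^{\alpha(Q)} E_{-\alpha}$$   [46]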


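For the classical series of loop algebras, the Hamilton functions
$H_{\boldsymbol{g}}$ in the canonically conjugate variables $q_n$,
$p_n$ ($n = 1, \ldots, N$) can be presented as


$$H_{\boldsymbol{g}}(p,q) = H_{A_{N-1}}(p,q) + \begin{cases} e^{-q_N} + e^{q_1+q_2}, & \boldsymbol{g} = B_N^{(1)} \\ e^{-2q_N} + e^{2q_1}, & \boldsymbol{g} = C_N^{(1)} \\ e^{-q_N-q_{N-1}} + e^{q_1+q_2}, & \boldsymbol{g} = D_N^{(1)} \\ e^{-2q_N} + e^{q_1+q_2}, & \boldsymbol{g} = A_{2N-1}^{(2)} \\ e^{-q_N} + e^{2q_1}, & \boldsymbol{g} = A_{2N}^{(2)} \\ e^{-q_N} + e^{q_1}, & \boldsymbol{g} = D_{N+1}^{(2)} \end{cases}$$   [47]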


Actually, one can find even more general integrable systems of the
Toda type: one can add to $H_{A_{N-1}}(p,q)$ any of the two
potentials $e^{-q_N - q_{N-1}}$ or
$\alpha e^{-q_N} + \beta e^{-2q_N}$ on one end combined with any of
the two potentials $e^{q_1 + q_2}$ or
$\gamma e^{q_1} + \delta e^{2q_1}$ on the other end, where
$\alpha, \beta, \gamma, \delta$ are arbitrary constants. This result
is due to E Sklyanin (1987).


Generalizations: Lattices with 
Nearest-Neighbor Interactions 


There exist further integrable lattice systems with the
nearest-neighbor interaction apart from the classical exponential
Toda lattice [1]. Those of the type
$\ddot q_n = r(\dot q_n)\bigl(g(q_{n+1} - q_n) - g(q_n - q_{n-1})\bigr)$
have been classified by R Yamilov in 1982, and the list contains,
apart from the usual Toda lattice [1], the following ones:


$$\ddot q_n = \dot q_n \left(e^{q_{n+1}-q_n} - e^{q_n-q_{n-1}}\right)$$   [48]


$$\ddot q_n = \dot q_n (q_{n+1} - 2q_n + q_{n-1})$$   [49]


1 1 
Y 7 2 
E ~ 7) {| —A>>— E- 
q (q (— — Qn qn indi —) | | 


dn vet (47 E y”) (coth(qm+1 > qn) 
— coth(q» — qn-1)) [51] 


Equations [48] are known as the "modified Toda lattice." Equations
[49] describe the "dual Toda lattice" which was instrumental in the
original discovery by Toda (see Toda (1989)). All systems [49]-[51]
can be obtained from [11] via suitable parametrizations of the
variables $a_n$, $b_n$ by canonically conjugate ones $q_n$, $p_n$,
similar to [10] for [1], see Suris (2003).

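A remarkable discovery of the integrable relativistic Toda lattice
is due to S Ruijsenaars (1990). This lattice with the equations of
motion


$$\ddot q_n = (1 + \alpha\dot q_{n+1})(1 + \alpha\dot q_n)\frac{e^{q_{n+1}-q_n}}{1 + \alpha^2 e^{q_{n+1}-q_n}} - (1 + \alpha\dot q_n)(1 + \alpha\dot q_{n-1})\frac{e^{q_n-q_{n-1}}}{1 + \alpha^2 e^{q_n-q_{n-1}}}$$   [52]


can be considered as the perturbation of the usual Toda lattice with
the small parameter $\alpha$ (the inverse speed of light).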

A class of integrable lattice systems of the relativistic Toda type
$\ddot q_n = r(\dot q_n)\bigl(\dot q_{n+1} f(q_{n+1} - q_n) - \dot q_{n-1} f(q_n - q_{n-1}) + g(q_{n+1} - q_n) - g(q_n - q_{n-1})\bigr)$
is richer than that of the Toda type, and has been isolated by
Yu B Suris and by V Adler and A Shabat in 1997. The list contains,
apart from the relativistic Toda lattice [52], two more
$\alpha$-perturbations of the usual Toda lattice [1]:


= (1 + ad4 1) 


qn = (1 + W741 ert % — (1 + Od, 1)e7" de 
== a? Ce =a eet} | [53] 


dn = (1 = adn) ( u Ani) edn+1— dn 
= (1 = ad, 1 e I ') [54] 


two $\alpha$-perturbations of the modified Toda lattice [48]:


edn+1 —Qn 
va =< 7 Qn41—dn Qn —Qn-1 ^ 
ES e — E Q Sa a xL 
Ak in( F Qn» LL ah ifa 


T [55] 


— Q ^ T ———————— 
dn l 1 + Quel dn 1 


o edn« dn 
Qn = q,(1 m adn) (a 04n+1) 1 + (yedn1— dn 
: edn d»-1 
0-1) 2 as) [56] 


two $\alpha$-perturbations of the dual Toda lattice [49]:


" l A4n+14n 

n = n n — 2 n + PE SSS 
dn = n(Gn+i — 24n + dn-1) Dr. 2 
GnGn-1 


E 14 (dn B dn-1) 97 


v" à è Qual — dn — 04n+1 
n — Un 1 +a? n (Sidi dea 
1 3 1 ) 1 + alany E Qn) 


Qn — An-1 — QQn-1 
— 1a A 58 
1 F Ga = da-1) | | 


and one $\alpha$-perturbation of each of the systems [50]
and [51]:


" =) 2 Qn+i — dn — QQn4A 
Qn == (q;, mi. ) 
(E o qs) m (va)? 


|. dn — dn-1 — Odn-1 ) [59] 


: sinh 2(q4,1 — qn) ^ v * sinh(2va) gn 
sinh^(q,,1 — qn) — sinh” (va) 


. Sinh 2(q, — qn-1) — v ! sinh(2va)q, 4 (60) 
sinh” (qn — Gn-1) — sinh” (va) 


A detailed study of all these systems, their interrelations, 
and time discretizations can be found in Suris (2003). 

There exist also lattices with more complicated nearest-neighbor
interactions, involving elliptic functions. They were discovered by
A Shabat and R Yamilov (1990), and by I Krichever (2000). For
example, the nonrelativistic elliptic Toda lattice is governed by
the equations


$$\ddot q_n = (\dot q_n^2 - 1)\bigl(V(q_n, q_{n+1}) + V(q_n, q_{n-1})\bigr)$$   [61]


where $V(q,q') = \zeta(q + q') + \zeta(q - q') - \zeta(2q)$ is an
elliptic function in both arguments $q$, $q'$ (here $\zeta(q)$ is
the Weierstrass $\zeta$-function).


Further Developments 
and Generalizations 


Sato's Theory 


Formulas [6], [31], and [35] have the same structure, with the
case-dependent functions $\tau_n(t)$ given by the determinants [7]
for the multisoliton solution in the infinite case, by the Hankel
determinants [32] or by the minors of the matrix $\exp(tL(0))$ in
the open case, and by the multidimensional theta functions in the
periodic case. All these seemingly different objects are actually
particular cases of a beautiful construction due to M Sato (1981),
developed by E Date, M Jimbo, M Kashiwara, T Miwa (1981-83), and by
G Segal and G Wilson (1985), which provides one of the major
unifying schemes for the theory of integrable systems. In this
construction, integrable systems are interpreted as simple dynamical
systems on an infinite-dimensional Grassmannian. The
$\tau$-function (first invented by R Hirota in 1971) receives in
this theory a representation-theoretical interpretation in terms of
the determinant bundle over the Grassmannian.


Band Matrices 


The Lax matrices [13] and [16] in the Manakov-Flaschka variables can
be easily generalized: in the symmetric matrix $L_0$ one can admit
nonvanishing elements in the band of the width $2s + 1 > 3$ around
the main diagonal, in the Hessenberg matrix $L$ one can admit more
nonvanishing diagonals in the upper-triangular part. A systematic
presentation of a large body of relevant results is given in
Kupershmidt (1985). In the setting of finite lattices, the
integrability of such systems becomes a nontrivial problem (as
opposed to the tridiagonal situation), because the number of
independent conjugation-invariant functions $\mathrm{tr}(L^s)$
becomes less than the number of degrees of freedom. An effective
approach to this problem based on the semi-invariant functions has
been found by P Deift, L-Ch Li, T Nanda, and C Tomei in 1986.


Two-Dimensional Toda Lattices 


Up to now, we considered integrable lattices with one continuous and
one discrete independent variable. This allows for a further
generalization. Integrable systems with two continuous and one
discrete independent variables are well known and widely used as
models of the field theory. For instance, the Toda field theory
deals with the system


$$(q_n)_{xy} = e^{q_{n+1}-q_n} - e^{q_n-q_{n-1}}$$   [62]


introduced in the soliton theory by A Mikhailov in 1979. This
two-dimensional system admits all possible kinds of reductions and
generalizations mentioned above for the usual Toda lattice. In
particular, the periodic two-dimensional Toda lattice is referred to
as the affine Toda field theory (with the prominent example of the
sine-Gordon field which corresponds to the period 2). Later, it was
realized that the equivalent equation
$(\log v_n)_{xy} = v_{n+1} - 2v_n + v_{n-1}$, which is obtained from
[62] by setting $v_n = \exp(q_{n+1} - q_n)$, already appeared in
studies by G Darboux in the 1880s, as the equation satisfied by the
Laplace invariants of the chain of Laplace transformations of a
given conjugate net. This relation to the classical differential
geometry was extensively studied by G Darboux, G Tzitzéica, and
others long before the advent of the theory of integrable systems.
Another link to the differential geometry is a more recent
observation, and relates the two-dimensional Toda lattice, with the
d'Alembert operator $(\cdot)_{xy}$ on the left-hand side of [62]
replaced by the Laplace operator $(\cdot)_{z\bar z}$, to harmonic
maps. For instance, the sinh-Gordon equation
$u_{z\bar z} = \sinh u$ governs harmonic maps from $\mathbb{C}$ into
the unit sphere $S^2$, which can be interpreted also as Gauss maps
of the constant mean curvature surfaces in $\mathbb{R}^3$. A review
of this topic can be found in Guest (1997).

Discretization of Toda lattices, nonabelian Toda lattices,
quantization of Toda lattices, dispersionless limit of Toda
lattices, etc., are only some of the further relevant topics, which
cannot be discussed in any detail within the limited scope of this
article, and the same holds, unfortunately, for such fascinating
applications of the Toda lattice as the Frobenius manifolds,
Laplacian growth problem, quantum cohomology, random matrix theory,
two-dimensional gravity, etc.


See also: Bäcklund Transformations; Bi-Hamiltonian
Methods in Soliton Theory; Classical r-Matrices,
Lie Bialgebras, and Poisson Lie Groups; Current Algebra;
Dynamical Systems and Thermodynamics; Functional
Equations and Integrable Systems; Integrable Discrete
Systems; Integrable Systems and Discrete Geometry;
Integrable Systems and the Inverse Scattering Method;
Integrable Systems: Overview; Lie Groups: General
Theory; Multi-Hamiltonian Systems; Quantum
Calogero-Moser Systems; Separation of Variables for
Differential Equations; Solitons and Kac-Moody Lie
Algebras; WDVV Equations and Frobenius Manifolds.
Introduction


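A finite Toeplitz matrix is an $n \times n$ matrix with the
following structure:


$$\begin{pmatrix} a_0 & a_{-1} & a_{-2} & \cdots & a_{-n+1} \\ a_1 & a_0 & a_{-1} & \cdots & a_{-n+2} \\ a_2 & a_1 & a_0 & \cdots & a_{-n+3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{n-1} & a_{n-2} & a_{n-3} & \cdots & a_0 \end{pmatrix}$$   [1]


The entries depend on the difference $i - j$ and hence they are
constant down all the diagonals. There are two cases when the
determinant is easy to compute. One is when the matrix is upper- or
lower-triangular and the determinant is $a_0^n$. The other case is
when the matrix is of the form


$$\begin{pmatrix} a_0 & a_{n-1} & a_{n-2} & \cdots & a_1 \\ a_1 & a_0 & a_{n-1} & \cdots & a_2 \\ a_2 & a_1 & a_0 & \cdots & a_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{n-1} & a_{n-2} & a_{n-3} & \cdots & a_0 \end{pmatrix}$$   [2]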


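In this latter case, the matrix is called a circulant matrix and
the eigenvalues are given by the formula

$$\psi(e^{2\pi i k/n}), \qquad 0 \le k \le n-1$$

where

$$\psi(e^{i\theta}) = \sum_{j=0}^{n-1} a_j e^{-ij\theta}$$

The corresponding eigenvector for eigenvalue $\psi(e^{2\pi i k/n})$
is

$$(1, e^{i2\pi k/n}, \ldots, e^{i2\pi k(n-1)/n})$$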


This can be verified by direct computation. The role 
of circulant matrices will not be emphasized in this 
article, although they are used in the computation of 
the generating function for certain dimer configura- 
tions and also in applications using the discrete 
Fourier transform. 
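
The direct computation mentioned above is easy to reproduce. The following sketch is an illustration with hypothetical function names: it builds the circulant matrix [2] with entry $(i,j)$ equal to $a_{(i-j) \bmod n}$, and pairs the eigenvector $(1, \omega^k, \omega^{2k}, \ldots)$, $\omega = e^{2\pi i/n}$, with the eigenvalue $\sum_j a_j \omega^{-jk}$. The sign of the exponent in the eigenvalue is the one forced by this index convention and is stated here as an assumption of the sketch.

```python
import numpy as np

def circulant(a):
    """n x n circulant matrix of the form [2]: entry (i, j) is a[(i - j) mod n]."""
    n = len(a)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return np.asarray(a)[idx]

def circulant_eigenpairs(a):
    """Eigenvalues sum_j a_j w^{-jk} and eigenvectors (w^{jk})_j, w = exp(2 pi i / n).

    Column k of the returned matrix is the eigenvector for eigenvalue k.
    """
    a = np.asarray(a, dtype=complex)
    n = len(a)
    idx = np.arange(n)
    w = np.exp(2j * np.pi / n)
    eigvals = np.array([np.sum(a * w ** (-k * idx)) for k in range(n)])
    eigvecs = np.array([w ** (k * idx) for k in range(n)]).T
    return eigvals, eigvecs
```

With these conventions, `circulant(a) @ eigvecs` should equal `eigvecs * eigvals` column by column, and the determinant of the circulant matrix is the product of the eigenvalues.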

The most common way to generate a finite Toeplitz matrix is with
the Fourier coefficients of an integrable function. Let
$\phi : \mathbb{T} \to \mathbb{C}$ be a function defined on the unit
circle with Fourier coefficients


$$\phi_k = \frac{1}{2\pi}\int_0^{2\pi} \phi(e^{i\theta})\, e^{-ik\theta}\, d\theta$$   [3]

We define $T_n(\phi)$ to be the Toeplitz matrix:


$$T_n(\phi) = (\phi_{i-j})_{i,j=0}^{n-1}$$


A basic problem that in large part has been motivated by
statistical mechanics is to determine the asymptotic behavior of
the determinant of $T_n(\phi)$ as $n \to \infty$. The determinant
will be referred to as $D_n(\phi)$, where $\phi$ is called the
generating function of the determinant. If the generating function
has the property that its Fourier coefficients vanish for negative
index (positive index) then the corresponding matrix is
lower-triangular (upper-triangular) and hence the determinant is
$\phi_0^n$. For other cases, the determinant is not easy to
determine and requires additional mathematical machinery.
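
The triangular case can be illustrated numerically. The sketch below is only an illustration: it approximates the Fourier coefficients [3] by an equally spaced sum on the circle (exact for trigonometric polynomials of low degree), assembles the matrix with entries $\phi_{i-j}$, and uses the hypothetical names `fourier_coefficients` and `toeplitz_matrix` and an assumed sample count.

```python
import numpy as np

def fourier_coefficients(phi, kmax, samples=4096):
    """Approximate phi_k for |k| <= kmax by an equally spaced sum over the unit circle."""
    theta = 2.0 * np.pi * np.arange(samples) / samples
    values = phi(np.exp(1j * theta))
    return {k: np.mean(values * np.exp(-1j * k * theta))
            for k in range(-kmax, kmax + 1)}

def toeplitz_matrix(phi, n):
    """T_n(phi) with entries phi_{i-j}, i, j = 0..n-1."""
    c = fourier_coefficients(phi, n - 1)
    return np.array([[c[i - j] for j in range(n)] for i in range(n)])
```

For a generating function such as $\phi(z) = 2 + z + z^2/2$, whose Fourier coefficients vanish for negative index, the matrix is lower-triangular and its determinant is $\phi_0^n = 2^n$.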
Some of the primary motivation to study the determinant of these
matrices comes from the two-dimensional Ising model. We consider
the Onsager lattice in the absence of a magnetic field with sites
labeled by

$$(i,j), \qquad 0 \le i \le M, \quad 0 \le j \le N$$

and with a value $\sigma_{i,j} = \pm 1$ assigned to each site. In
the Ising model, $\sigma_{i,j}$ signifies the state of the spin at
the site $(i,j)$. To each possible configuration of spins, we
define an energy


$$E(\sigma) = -E_1 \sum_{i,j} \sigma_{i,j}\sigma_{i+1,j} - E_2 \sum_{i,j} \sigma_{i,j}\sigma_{i,j+1}$$

Let

$$Z = \sum_{\sigma} e^{-\beta E(\sigma)}$$


be the partition function. Then the probability of a given
configuration is

$$\frac{1}{Z}\, e^{-\beta E(\sigma)}$$

Here $E_1$, $E_2$, and $\beta = 1/kT$ are, without loss of
generality, assumed to be positive constants, $T$ is the
temperature, and $k$ is the Boltzmann constant. If $X$ is a random
variable defined on the space of configurations, the expectation is
given by


$$E(X) = \frac{1}{Z}\sum_{\sigma} X(\sigma)\, e^{-\beta E(\sigma)}$$


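Let $n$ be fixed for the moment, assume toroidal
boundary conditions for the lattice, and then let
$N, M \to \infty$. It is known that the random variable

\[ X(\sigma) = \sigma_{0,0}\,\sigma_{0,n} \]

has expectation $\langle \sigma_{0,0}\sigma_{0,n} \rangle$ given by $D_n(\phi)$, where

\[ \phi(e^{i\theta}) = \left[ \frac{(1-\alpha_1 e^{i\theta})(1-\alpha_2 e^{-i\theta})}{(1-\alpha_1 e^{-i\theta})(1-\alpha_2 e^{i\theta})} \right]^{1/2} \]

and

\[ \alpha_1 = z_1 \frac{1-z_2}{1+z_2}, \quad \alpha_2 = z_1^{-1} \frac{1-z_2}{1+z_2}, \quad z_1 = \tanh \beta E_1, \quad z_2 = \tanh \beta E_2 \]

The square root is taken so that $\phi(e^{i\pi}) = 1$. This
formula was first stated by Onsager and later
verified in a difficult computation by Montroll,
Potts, and Ward.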

The spontaneous magnetization $M$ for the Ising
model is defined by

\[ M^2 = \lim_{n\to\infty} \langle \sigma_{0,0}\sigma_{0,n} \rangle = \lim_{n\to\infty} D_n(\phi) \]

Note that $M$ is the square root of the limiting correlation
between two distant sites. Hence, the asymptotics of
the Toeplitz determinants will determine whether
the magnetization is positive or tends to zero as
$n \to \infty$.


Strong Szegő Limit Theorem


To determine the behavior of the determinants, we
need to analyze the generating function $\phi$. Let us
first consider the case where $\alpha_2 < 1$. (It is always the
case that $0 < \alpha_1 < 1$.) This generating function is
differentiable, nonzero and has winding number
zero, and it is for functions of this type that a
second-order expansion of the Toeplitz determinants
can be described. The expansion, first formulated by
Szegő in response to the question concerning the
spontaneous magnetization, is called the “strong
Szegő limit theorem.”

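Before proving the Szegő theorem, it should be
remarked that we can view the finite Toeplitz matrix
as a truncation of an infinite array,

\[ \begin{pmatrix} \phi_0 & \phi_{-1} & \phi_{-2} & \cdots \\ \phi_1 & \phi_0 & \phi_{-1} & \cdots \\ \phi_2 & \phi_1 & \phi_0 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \qquad [4] \]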

The above infinite array is the matrix representation
for the Toeplitz operator

\[ T(\phi) : H^2 \to H^2 \]

defined by

\[ T(\phi) f = P(\phi f) \]

where $H^2$ is the Hardy space

\[ \{ f \in L^2(\mathbf{T}) \mid f_k = 0,\ k < 0 \} \]

the function $\phi \in L^\infty(\mathbf{T})$, and $P$ is the orthogonal
projection of $L^2(\mathbf{T})$ onto $H^2$. The matrix representation
given in [4] is with respect to the Hilbert space
basis of $H^2$,

\[ \{ e^{ik\theta} \mid 0 \le k < \infty \} \]

and $\phi$ is called the symbol of the operator. Now
define $P_n : H^2 \to H^2$ by

\[ P_n(f_0, f_1, f_2, \ldots) = (f_0, f_1, \ldots, f_{n-1}, 0, 0, \ldots) \]

The finite Toeplitz matrix can be thought of as the
upper-left corner of the array given in [4] or as
$P_n T(\phi) P_n$.

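To prove the strong Szegő limit theorem, we
introduce the Banach algebra $B$ of bounded functions
$f$ satisfying $\sum_{k=-\infty}^{\infty} |k|\,|f_k|^2 < \infty$.


Theorem 1 (Strong Szegő limit theorem). Assume
$\phi = \phi_- \phi_+$, where $\phi_\pm$ have logarithms in $B$. Suppose
$\overline{\log \phi_-},\ \log \phi_+ \in H^2$. Then

\[ \lim_{n\to\infty} D_n(\phi)/G(\phi)^n = E(\phi) = \exp\left( \sum_{k=1}^{\infty} k\, s_k s_{-k} \right) \]

where $G(\phi) = \exp((\log \phi)_0)$ and $s_k = (\log \phi)_k$.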

Since $B$ is a Banach algebra, it follows that if
$\log \phi_\pm$ belong to $B$ so do

\[ \phi_+,\ \phi_-,\ \phi_+^{-1},\ \phi_-^{-1},\ \phi,\ \phi^{-1} \]

and hence they are bounded. Since $\phi_+$ is in $H^2$ as
well, its Fourier coefficients vanish for negative
index and the Toeplitz operator has a corresponding
infinite array that is lower-triangular. The Fourier
coefficients of $\phi_-$ vanish for positive index and
hence its infinite array is upper-triangular. From
this, it follows that

\[ T(\phi_+) T(\phi_+^{-1}) = T(\phi_-) T(\phi_-^{-1}) = I \qquad [5] \]

\[ T(\phi_-) T(\phi_+) = T(\phi) \qquad [6] \]

and

\[ P_n T(\phi_+) = P_n T(\phi_+) P_n, \qquad P_n T(\phi_-) P_n = T(\phi_-) P_n \qquad [7] \]
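This yields

\[ D_n(\phi) = \det T_n(\phi) = \det P_n T(\phi) P_n \qquad [8] \]

\[ = \det P_n T(\phi_+) T(\phi_+^{-1}) T(\phi) T(\phi_-^{-1}) T(\phi_-) P_n \qquad [9] \]

\[ = \det P_n T(\phi_+) P_n\, T(\phi_+^{-1}) T(\phi) T(\phi_-^{-1})\, P_n T(\phi_-) P_n \qquad [10] \]

\[ = \det P_n T(\phi_+) P_n \times \det\left( P_n T(\phi_+^{-1}) T(\phi) T(\phi_-^{-1}) P_n \right) \times \det P_n T(\phi_-) P_n \qquad [11] \]

The first and last determinants on the right-hand side
of the above expression are $((\phi_+)_0)^n$ and $((\phi_-)_0)^n$,
respectively. Now given the Banach algebra conditions
imposed on the symbol $\phi$, it follows that the
operator

\[ T(\phi_+^{-1}) T(\phi) T(\phi_-^{-1}) \]

is of the form $I + K$, where $K$ is trace class. Hence,
the eigenvalues $\lambda_i$ of $K$ satisfy

\[ \sum_i |\lambda_i| < \infty \]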


and the infinite (Fredholm) determinant of $I + K$ is
defined. To verify the claim that the operator

\[ T(\phi_+^{-1}) T(\phi) T(\phi_-^{-1}) = T(\phi_+^{-1}) T(\phi_-) T(\phi_+) T(\phi_-^{-1}) \]

is $I$ plus a trace class operator, we use the identity

\[ T(fg) - T(f) T(g) = H(f) H(\tilde{g}) \qquad [12] \]

where $H(f)$ has matrix form $(f_{i+j+1})_{i,j=0}^{\infty}$, and
$\tilde{g}(e^{i\theta}) = g(e^{-i\theta})$. Our Banach algebra conditions
show that if $f$ is in $B$ then the operator $H(f)$ satisfies
$\sum_{i,j} |a_{ij}|^2 < \infty$, where the $a_{ij}$ are the matrix entries
of the operator. Any operator satisfying this is called
a Hilbert-Schmidt operator, and it is known that the
product of two Hilbert-Schmidt operators is trace class.
Applying the identity to

\[ T(\phi_+^{-1}) T(\phi_-) \]

shows that this operator is $T(\phi_+^{-1}\phi_-)$ plus trace class.
The operator

\[ T(\phi_+) T(\phi_-^{-1}) \]

is thus $T(\phi_+ \phi_-^{-1})$ plus trace class, and one more
application of the identity, combined with the fact
that trace class operators form an ideal, yields the
desired result.
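
The identity relating Toeplitz and Hankel operators can be checked exactly for trigonometric polynomials, where all sums are finite. The following is a minimal sketch under that assumption; the dictionaries `f` and `g` and all helper names are hypothetical, and the infinite matrices are truncated to a size large enough that the compared upper-left corner is unaffected.

```python
def product_coefficients(f, g):
    """Fourier coefficients of the product f*g (convolution of coefficients)."""
    out = {}
    for i, fi in f.items():
        for j, gj in g.items():
            out[i + j] = out.get(i + j, 0.0) + fi * gj
    return out


def toeplitz_block(f, size):
    """Upper-left size x size block of T(f), entries f_{i-j}."""
    return [[f.get(i - j, 0.0) for j in range(size)] for i in range(size)]


def hankel_block(f, size):
    """Upper-left size x size block of H(f), entries f_{i+j+1}."""
    return [[f.get(i + j + 1, 0.0) for j in range(size)] for i in range(size)]


def matmul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def reflect(g):
    """Coefficients of g~(e^{i theta}) = g(e^{-i theta})."""
    return {-k: v for k, v in g.items()}


# Two illustrative trigonometric polynomials of degree at most 2.
f = {-1: 0.5, 0: 1.0, 2: -0.3}
g = {-2: 0.7, 0: 2.0, 1: 0.4}

corner = 4          # size of the block that is compared
size = corner + 4   # truncation large enough for degree-2 symbols

t_fg = toeplitz_block(product_coefficients(f, g), size)
t_f_t_g = matmul(toeplitz_block(f, size), toeplitz_block(g, size))
h_h = matmul(hankel_block(f, size), hankel_block(reflect(g), size))

max_error = max(
    abs(t_fg[i][j] - t_f_t_g[i][j] - h_h[i][j])
    for i in range(corner) for j in range(corner)
)
```

For symbols with finitely many nonzero coefficients the Hankel matrices have only finitely many nonzero entries, which makes the trace-class property evident in this special case.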

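From the theory of infinite determinants, as
$n \to \infty$,

\[ \det P_n T(\phi_+^{-1}) T(\phi) T(\phi_-^{-1}) P_n \qquad [13] \]

converges to

\[ \det\left( T(\phi_+^{-1}) T(\phi) T(\phi_-^{-1}) \right) \qquad [14] \]

At this point, we have proved that

\[ \lim_{n\to\infty} D_n(\phi) / \left( (\phi_-)_0 (\phi_+)_0 \right)^n = \lim_{n\to\infty} \det P_n T(\phi_+^{-1}) T(\phi) T(\phi_-^{-1}) P_n = \det\left( T(\phi_+^{-1}) T(\phi) T(\phi_-^{-1}) \right) \qquad [15] \]

It only remains to identify the constants. To see that

\[ G(\phi)^n = ((\phi_-)_0)^n ((\phi_+)_0)^n \]

we note that

\[ G(\phi) = \exp\left( (\log \phi)_0 \right) = \exp\left( \frac{1}{2\pi} \int_0^{2\pi} \log \phi(e^{i\theta})\, d\theta \right) \]

\[ = \exp\left( \frac{1}{2\pi} \int_0^{2\pi} \left( \log \phi_-(e^{i\theta}) + \log \phi_+(e^{i\theta}) \right) d\theta \right) \]

\[ = \exp\left( (\log \phi_-)_0 \right) \exp\left( (\log \phi_+)_0 \right) = (\phi_-)_0 (\phi_+)_0 \]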

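To compute the determinant of

\[ T(\phi_+^{-1}) T(\phi) T(\phi_-^{-1}) \]

we write

\[ \det T(\phi_+^{-1}) T(\phi) T(\phi_-^{-1}) = \det T(\phi_+^{-1}) T(\phi_- \phi_+) T(\phi_-^{-1}) = \det T(\phi_+^{-1}) T(\phi_-) T(\phi_+) T(\phi_-^{-1}) \]

This last expression is of the form

\[ e^{A} e^{B} e^{-A} e^{-B} \]

where

\[ A = -T(\log \phi_+) \quad \text{and} \quad B = T(\log \phi_-) \]

If $AB - BA$ is trace class then

\[ \det e^{A} e^{B} e^{-A} e^{-B} = e^{\mathrm{tr}(AB - BA)} \]

The operator $AB - BA$ is

\[ -T(\log \phi_+) T(\log \phi_-) + T(\log \phi_-) T(\log \phi_+) \]

which equals

\[ -T(\log \phi_+) T(\log \phi_-) + T\left( (\log \phi_-)(\log \phi_+) \right) \]

and, by the identity from eqn [12], becomes

\[ H(\log \phi_+)\, H(\widetilde{\log \phi_-}) \]

It can be directly computed that

\[ \mathrm{tr}\left( H(\log \phi_+)\, H(\widetilde{\log \phi_-}) \right) \]

equals

\[ \sum_{k=1}^{\infty} k\, s_k s_{-k} \]

and the theorem is proved.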
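
The limit in the theorem can be observed numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it uses the hypothetical test symbol whose logarithm is 2a cos(theta), so that s_1 = s_{-1} = a, all other s_k vanish, G = 1, and the predicted limit is exp(a^2). Fourier coefficients are approximated by the trapezoidal rule, and all function names are choices made for this example.

```python
import cmath
import math


def fourier_coefficient(symbol, k, samples=512):
    """Trapezoidal approximation of (1/2pi) int phi(e^{i t}) e^{-i k t} dt."""
    total = 0j
    for m in range(samples):
        theta = 2.0 * math.pi * m / samples
        total += symbol(theta) * cmath.exp(-1j * k * theta)
    return total / samples


def determinant(rows):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[complex(x) for x in row] for row in rows]
    n = len(a)
    det = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-14:
            return 0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def toeplitz_determinant(symbol, n):
    """D_n for the matrix (phi_{i-j}), i, j = 0..n-1."""
    coeffs = {k: fourier_coefficient(symbol, k) for k in range(-(n - 1), n)}
    return determinant([[coeffs[i - j] for j in range(n)] for i in range(n)])


A = 0.5  # illustrative parameter


def symbol(theta):
    # log phi = 2 A cos(theta) = A e^{i theta} + A e^{-i theta}
    return math.exp(2.0 * A * math.cos(theta))


d_8 = toeplitz_determinant(symbol, 8).real
szego_limit = math.exp(A * A)  # exp(sum_k k s_k s_{-k}) with s_1 = s_{-1} = A
```

For this analytic symbol the determinants approach the predicted constant very quickly, so a small matrix size already suffices for the comparison.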


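Returning to the Ising model, one needs to
compute the asymptotics of the determinants for
the generating function

\[ \phi(e^{i\theta}) = \left[ \frac{(1-\alpha_1 e^{i\theta})(1-\alpha_2 e^{-i\theta})}{(1-\alpha_1 e^{-i\theta})(1-\alpha_2 e^{i\theta})} \right]^{1/2} \]

The term $G(\phi) = 1$ and for $k > 0$

\[ k\, s_k s_{-k} = \frac{1}{4} \left( -\frac{\alpha_1^{2k}}{k} - \frac{\alpha_2^{2k}}{k} + \frac{2(\alpha_1\alpha_2)^k}{k} \right) \]

from which it follows that

\[ \lim_{n\to\infty} D_n(\phi) = \left[ \frac{(1-\alpha_1^2)(1-\alpha_2^2)}{(1-\alpha_1\alpha_2)^2} \right]^{1/4} \]

Recalling the definition of $\alpha_1$ and $\alpha_2$ yields

\[ \lim_{n\to\infty} D_n(\phi) = \left[ 1 - \frac{1}{(\sinh 2\beta E_1 \sinh 2\beta E_2)^2} \right]^{1/4} \]

or the spontaneous magnetization $M$ as

\[ M = \left[ 1 - \frac{1}{(\sinh 2\beta E_1 \sinh 2\beta E_2)^2} \right]^{1/8} \]

In order for this computation to be valid, it was
necessary for $0 < \alpha_2 < 1$, and by elementary computations
one can show that this is equivalent to the
inequality

\[ \sinh 2\beta E_1 \sinh 2\beta E_2 > 1 \]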
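
The elementary computations mentioned above can be spot-checked numerically. This sketch assumes the parametrization of the two constants through z_1 = tanh(beta E_1) and z_2 = tanh(beta E_2), with the first constant equal to z_1 (1 - z_2)/(1 + z_2) and the second equal to (1 - z_2)/(z_1 (1 + z_2)); the function names and the sample couplings are hypothetical.

```python
import math


def ising_alphas(beta_e1, beta_e2):
    """Return (alpha_1, alpha_2) for dimensionless couplings beta*E_1, beta*E_2."""
    z1 = math.tanh(beta_e1)
    z2 = math.tanh(beta_e2)
    ratio = (1.0 - z2) / (1.0 + z2)
    return z1 * ratio, ratio / z1


def szego_constant(alpha1, alpha2):
    """Limit of D_n for the smooth Ising symbol (valid when alpha_2 < 1)."""
    return ((1 - alpha1 ** 2) * (1 - alpha2 ** 2) / (1 - alpha1 * alpha2) ** 2) ** 0.25


def magnetization_squared(beta_e1, beta_e2):
    """Closed form [1 - (sinh 2bE1 sinh 2bE2)^(-2)]^(1/4), i.e. M squared."""
    s = math.sinh(2 * beta_e1) * math.sinh(2 * beta_e2)
    return (1 - 1 / s ** 2) ** 0.25


# Low-temperature sample point: sinh(1.4) * sinh(1.0) > 1.
low_t = (0.7, 0.5)
a1, a2 = ising_alphas(*low_t)
limit_from_alphas = szego_constant(a1, a2)
limit_from_sinh = magnetization_squared(*low_t)
magnetization = limit_from_sinh ** 0.5

# High-temperature sample point: sinh(0.6)^2 < 1, so alpha_2 exceeds 1.
_, a2_high = ising_alphas(0.3, 0.3)
```

At the low-temperature point both expressions for the limit agree and the second constant is below 1, while at the high-temperature point it exceeds 1, which is the situation where the smooth-symbol computation no longer applies.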


Nonsmooth Symbols or $T = T_c$


A problem occurs in the analysis just outlined when
the inequality $0 < \alpha_2 < 1$ does not hold. There are
two separate possibilities, $\alpha_2 > 1$ or $\alpha_2 = 1$. First, we
consider the latter case. For fixed $E_1$ and $E_2$, this
happens for exactly one fixed value of the constant
$\beta_c = 1/kT_c$, and the corresponding temperature $T_c$ is
called the critical temperature. The “strong Szegő
limit theorem” does not apply since our generating
function is of the form

\[ \phi(e^{i\theta}) = \left[ \frac{(1-\alpha_1 e^{i\theta})(1-e^{-i\theta})}{(1-\alpha_1 e^{-i\theta})(1-e^{i\theta})} \right]^{1/2} \qquad [16] \]

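In 1968, Fisher and Hartwig raised a conjecture
about $D_n(\phi)$ for nonsmooth $\phi$ which included the
above example. They considered generating functions
of the form

\[ \phi(e^{i\theta}) = \psi(e^{i\theta}) \prod_{j=1}^{R} \phi_{\alpha_j,\beta_j}(e^{i(\theta-\theta_j)}) \qquad [17] \]

where

\[ \phi_{\alpha,\beta}(e^{i\theta}) = (2 - 2\cos\theta)^{\alpha}\, e^{i\beta(\theta-\pi)}, \quad 0 < \theta < 2\pi \]

$\mathrm{Re}\,\alpha_j > -1/2$, and $\beta_j$ is not an integer. The function $\psi$
is assumed to be a smooth function. Using the
Fisher-Hartwig notation, the symbol of interest in
the Ising model from eqn [16] can be written as

\[ \psi(e^{i\theta})\, \phi_{0,-1/2}(e^{i\theta}) \]

where

\[ \psi(e^{i\theta}) = \left[ \frac{1-\alpha_1 e^{i\theta}}{1-\alpha_1 e^{-i\theta}} \right]^{1/2} \]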

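The conjecture of Fisher and Hartwig for general
symbols of this type stated that

\[ D_n(\phi) \sim G(\psi)^n\, n^{\Omega}\, E^* \]

where

\[ \Omega = \sum_{j=1}^{R} (\alpha_j^2 - \beta_j^2) \]

and $E^*$ is a constant whose value they did not
identify. The constant was later computed to be

\[ E^*(\phi) = E(\psi) \prod_{j=1}^{R} \psi_+(e^{i\theta_j})^{-\alpha_j+\beta_j}\, \psi_-(e^{i\theta_j})^{-\alpha_j-\beta_j} \times \prod_{1 \le s \ne r \le R} \left( 1 - e^{i(\theta_s-\theta_r)} \right)^{-(\alpha_s+\beta_s)(\alpha_r-\beta_r)} \times \prod_{j=1}^{R} \frac{G(1+\alpha_j+\beta_j)\, G(1+\alpha_j-\beta_j)}{G(1+2\alpha_j)} \]

where $G(z)$ is the Barnes G-function satisfying

\[ G(1+z) = \Gamma(z)\, G(z) \]

and is defined by

\[ G(1+z) = (2\pi)^{z/2}\, e^{-(z+1)z/2 - \gamma_E z^2/2} \prod_{k=1}^{\infty} \left( \left(1+\frac{z}{k}\right)^{k} e^{-z + z^2/(2k)} \right) \]

with $\gamma_E$ being Euler's constant.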

For the above factors, we normalize $\psi$ so that the
geometric mean is 1. Then we may assume that
the factors $\psi_+$, $\psi_-$ ($\psi = \psi_-\psi_+$) are 1 at zero and
infinity, respectively, and this defines the logarithms
for the first product. The $E(\psi)$ term is the
constant in Szegő's theorem, and the argument of
a term of the form $(1 - e^{i(\theta_s-\theta_r)})$ is taken between
$-\pi/2$ and $\pi/2$.

In the case where $R = 1$, the conjecture is known
to hold if $\mathrm{Re}\,\alpha > -1/2$ and the function $\psi$ satisfies the
conditions of Szegő's theorem and is infinitely
differentiable. The theorem also has an extension
to the case where $\mathrm{Re}\,\alpha < -1/2$, with $2\alpha$ not an
integer, as long as the Fourier coefficients are
defined as the coefficients of a distribution.

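If we apply the theorem to the generating function
from [16],

\[ \psi(e^{i\theta})\, \phi_{0,-1/2}(e^{i\theta}) = \left[ \frac{1-\alpha_1 e^{i\theta}}{1-\alpha_1 e^{-i\theta}} \right]^{1/2} \phi_{0,-1/2}(e^{i\theta}) \]

we see that the asymptotic expansion is given by

\[ n^{-1/4} \left( \frac{1+\alpha_1}{1-\alpha_1} \right)^{1/4} G(1/2)\, G(3/2) \]

This last formula shows that, at the critical
temperature,

\[ \lim_{n\to\infty} \langle \sigma_{0,0}\sigma_{0,n} \rangle = \lim_{n\to\infty} D_n(\phi) = 0 \]

thus, $M = 0$, and hence there is no correlation
between distant lattice points.

It should be remarked here that the diagonal
correlation at the critical temperature is also given
by a singular Toeplitz determinant,

\[ \langle \sigma_{0,0}\sigma_{n,n} \rangle = D_n(\phi_{0,-1/2}) \sim n^{-1/4}\, G(1/2)\, G(3/2) \]

and thus this limit is also zero.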
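
The power-law decay of the diagonal correlation can be checked numerically. The sketch below makes two assumptions that should be read as such. First, the Fourier coefficients of the pure jump symbol exp(-i(theta - pi)/2) on (0, 2 pi) are taken to be 1/(pi (k + 1/2)), obtained by direct integration with the coefficient convention e^{-ik theta}. Second, the product G(1/2) G(3/2) is evaluated through the closed form 2^(1/12) exp(3 zeta'(-1)), using a hard-coded numerical value of zeta'(-1). All names are hypothetical.

```python
import math

ZETA_PRIME_AT_MINUS_ONE = -0.1654211437  # assumed numerical value of zeta'(-1)


def jump_coefficient(k):
    """Fourier coefficient phi_k of exp(-i(theta - pi)/2), 0 < theta < 2*pi."""
    return 1.0 / (math.pi * (k + 0.5))


def determinant(rows):
    """Determinant of a real matrix by Gaussian elimination with pivoting."""
    a = [list(map(float, row)) for row in rows]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-14:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def diagonal_correlation(n):
    """D_n for the matrix (phi_{i-j}) built from the jump coefficients."""
    return determinant(
        [[jump_coefficient(i - j) for j in range(n)] for i in range(n)]
    )


n = 16
d_1 = diagonal_correlation(1)            # nearest diagonal neighbours
scaled = diagonal_correlation(n) * n ** 0.25
barnes_product = 2 ** (1 / 12) * math.exp(3 * ZETA_PRIME_AT_MINUS_ONE)
```

The determinants themselves tend to zero, while the rescaled values settle near the constant, in line with the stated asymptotic form.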

The proof of the Fisher-Hartwig conjecture is
much more complicated than the proof of the
“strong Szegő limit theorem.” For an indication of
how it is proved, note that if we consider the
generating function $\phi_{0,\beta}$, the Fourier coefficients
are $(\sin \pi\beta)/[\pi(\beta - k)]$ for index $k$, and hence the matrix is
Cauchy and the determinant can be computed
exactly. From this the asymptotics can be derived
and they yield a special case of the Fisher-Hartwig
conjecture. The main idea in extending the result to
a symbol of the form

\[ \psi(e^{i\theta})\, \phi_{0,\beta}(e^{i\theta}) \]

is to prove that the limit of

\[ \frac{D_n(\psi\, \phi_{0,\beta})}{D_n(\psi)\, D_n(\phi_{0,\beta})} \]

exists. The proof uses much of the same trace-class
approach used in proving the “strong Szegő limit
theorem,” although the results are more complicated.
These ideas are then extended for $R > 1$ and
also more general $\beta$ and $\alpha$.

It should be noted that the Fisher-Hartwig
conjecture, as stated in this article, does not always hold. If we
consider the function

\[ \phi(e^{i\theta}) = \begin{cases} -1, & -\pi < \theta < 0 \\ 1, & 0 < \theta < \pi \end{cases} \]

then

\[ \phi_k = \begin{cases} 0, & \text{if } k \text{ is even} \\ -2i/(\pi k), & \text{if } k \text{ is odd} \end{cases} \]

The matrix $T_n(\phi)$ is antisymmetric and, if $n$ is odd,
$D_n(\phi) = 0$. If $n$ is even, using elementary row and
column operations, the determinant can be put in
block form with each block of Cauchy type. The
determinant can then be evaluated to find


D,(ó) ~ (in ^K 


where K is a certain constant. 
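
The parity behavior of this example is easy to reproduce. The following sketch builds the matrix from the coefficients 0 (even index) and -2i/(pi k) (odd index) and evaluates small determinants; the helper names are hypothetical.

```python
import math


def sign_coefficient(k):
    """Fourier coefficient of the symbol equal to -1 on (-pi, 0) and 1 on (0, pi)."""
    if k % 2 == 0:
        return 0j
    return -2j / (math.pi * k)


def determinant(rows):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[complex(x) for x in row] for row in rows]
    n = len(a)
    det = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-14:
            return 0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def sign_determinant(n):
    """D_n for the matrix (phi_{i-j}), i, j = 0..n-1."""
    return determinant(
        [[sign_coefficient(i - j) for j in range(n)] for i in range(n)]
    )


matrix_4 = [[sign_coefficient(i - j) for j in range(4)] for i in range(4)]
d_2 = sign_determinant(2)
d_3 = sign_determinant(3)
d_4 = sign_determinant(4)
d_5 = sign_determinant(5)
```

The odd-order determinants vanish because an antisymmetric matrix of odd size is singular, while the even-order ones are nonzero.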
It is instructive to note that

\[ \phi(e^{i\theta}) = i\, \phi_{0,1/2}(e^{i\theta})\, \phi_{0,-1/2}(e^{i(\theta-\pi)}) = -i\, \phi_{0,-1/2}(e^{i\theta})\, \phi_{0,1/2}(e^{i(\theta-\pi)}) \]

and thus that this particular symbol has two
representations of the type given in [17], and each
would give a different asymptotic expansion of the
determinant if the conjecture were true for this set of
parameters. Hence, it is clear that the conjecture
must fail to hold in this case.

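However, this example indicates that there might
be a generalization of the original conjecture of
Fisher and Hartwig. If

\[ \phi(e^{i\theta}) = \psi(e^{i\theta}) \prod_{j=1}^{R} \phi_{\alpha_j,\beta_j}(e^{i(\theta-\theta_j)}) \]

then it is also the case that

\[ \phi(e^{i\theta}) = \psi^*(e^{i\theta}) \prod_{j=1}^{R} \phi_{\alpha_j,\beta_j+n_j}(e^{i(\theta-\theta_j)}) \]

where

\[ \sum_{j=1}^{R} n_j = 0 \quad \text{and} \quad \psi^* = \psi \prod_{j=1}^{R} (-e^{i\theta_j})^{n_j} \]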
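In the example above, $\beta_1 = 1/2$, $\beta_2 = -1/2$,
$\theta_1 = 0$, $\theta_2 = \pi$, $n_1 = -1$, and $n_2 = 1$. The result for
the counterexample, combined with what is known
for the case of integer values of $\alpha$ and $\beta$, leads to the
following generalized conjecture. Suppose

\[ \phi(e^{i\theta}) = \psi^{(k)}(e^{i\theta}) \prod_{j=1}^{R} \phi_{\alpha_j^{(k)},\beta_j^{(k)}}(e^{i(\theta-\theta_j)}) \]

for some set of indices $k$. Define

\[ \Omega(k) = \sum_{j=1}^{R} \left( (\alpha_j^{(k)})^2 - (\beta_j^{(k)})^2 \right) \]

and

\[ \mathcal{K} = \{ k \mid \mathrm{Re}(\Omega(k)) = \Omega \} \]

where $\Omega = \max_k \mathrm{Re}(\Omega(k))$. The generalized asymptotic formula is conjectured
to be

\[ D_n(\phi) = \sum_{k \in \mathcal{K}} G(\psi^{(k)})^n\, n^{\Omega(k)}\, E^{*(k)} + o\left( |G(\phi)|^n\, n^{\Omega} \right) \]

with $E^{*(k)}$ the constant $E^*$ evaluated for the $k$th representation.

It may turn out that there is only one element in $\mathcal{K}$,
and for these symbols there is a unique representation
that yields the highest power in the exponent of
the asymptotic expansion. These are the symbols for
which the original Fisher-Hartwig conjecture should
be true and it is now confirmed in these cases. For
example, the conjecture is known to hold for $R > 1$
when $|\mathrm{Re}\,\alpha_j| < 1/2$ and $|\mathrm{Re}\,\beta_j| < 1/2$.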


Symbols with Nonzero Index or $T > T_c$


The last possibility in computing the correlation
asymptotics is the case where $\alpha_2 > 1$. Note that, for
fixed $E_1$ and $E_2$, there is exactly one value of
$\beta_c = 1/kT_c$ where

\[ \alpha_2 = z_1^{-1}\, \frac{1-z_2}{1+z_2} = 1 \]

For values of $T > T_c$, we have that the symbol

\[ \left[ \frac{(1-\alpha_1 e^{i\theta})(1-\alpha_2 e^{-i\theta})}{(1-\alpha_1 e^{-i\theta})(1-\alpha_2 e^{i\theta})} \right]^{1/2} \]

is the same as

\[ e^{-i\theta} \left[ \frac{(1-\alpha_1 e^{i\theta})(1-(1/\alpha_2) e^{i\theta})}{(1-\alpha_1 e^{-i\theta})(1-(1/\alpha_2) e^{-i\theta})} \right]^{1/2} \]

with the argument chosen so that the symbol is
positive at $\pi$. Except for the extra factor of $e^{-i\theta}$, this
is the same type of smooth symbol that was
considered earlier (see the section “Strong Szegő
limit theorem”). However, a factor of $e^{-i\theta}$ can change
the asymptotics considerably, as can be seen by
considering the simple example in which the smooth factor is identically 1.
Fortunately, a variation of the Szegő theorem, first
considered by Fisher and Hartwig, holds for this
case of smooth symbols with nonvanishing index.


Theorem 2 Suppose that $\phi = \phi_- \phi_+$ satisfies the
conditions of the “strong Szegő limit theorem” and
in addition is at least once continuously differentiable.
Then, if $b = \phi_- \phi_+^{-1}$ and $c = \phi_-^{-1} \phi_+$,


Dyle) 
~ (71) 1" GG)" E($)G(c)" [18] 
be + De 
x | det : T + O(n-?) 
boim ==" 5g 
x (1+ O(n-")) [19] 


Applying this to the symbol 
duih _ io (1 — o1e")(1 — (1/o2)e") 
uii: d en (d — ae 9)(1— (1/a;)e-i9) 
we have that m= 1, G(d) = —1, G(c) = —1, and 


E(ó) = [ -ap(1-3)(1- 3 20 


The determinant in the above formula is the 


constant 


Lope ið L » 
b, === (1 — œe") {| 1——e 
2r 0 a2 


| L «ii .. 
x(1— nie ^ (1 - 29) e "dg 


Q2 


The last integral can be deformed to a segment of 
the real line and evaluated asymptotically to find 
that the leading term is 


1 01 1 A 
lU E -aoa (1 -3)) 


, I + 1/2) 
D(n+ 1) 


Putting this together with the above constants, we 
have, for T > Te, 


(09,070,n) 


"— Y (1—92)"* 1-4) "aa T jue 
Vno i a$ iii 


This implies that the correlation tends to zero very
rapidly as $n \to \infty$.


Further Remarks 


The interaction between statistical mechanics and 
the theory of Toeplitz determinants has a long 
history, and much of the motivation to describe the 
asymptotics of the determinants was spurred by the
question of spontaneous magnetization in the two-dimensional
Ising model. The previous three sections
attempt to show how the very different physical
situations, $T < T_c$, $T = T_c$, and $T > T_c$, all
correspond to very different behavior in the symbols
of the generating functions. Critical systems predict
qualitatively different Szegő type theorems. For
example, the phase transition at $T_c$ predicts that
the asymptotics for singular symbols cannot be
predicted by the smooth symbols, that is, one cannot
use continuous functions to approximate the results
for singular symbols.

Onsager (1971) was the first to understand that 
the correlation function could be expressed as a 
Toeplitz determinant. This was made explicit by 
Montroll et al. (1963). For more information about 
the Ising model, the reader is referred to McCoy 
and Wu (1973), where a clear and complete 
description of the Ising model (and most of the 
notation used here in reference to this model) can 
be found. 

Szegő (1915, 1952) had originally proved a weak
form of the “limit” theorem and he understood that
it was desirable to extend it to a second-order term.
Szegő first proved the “strong Szegő limit theorem”
for positive generating functions and this was later
extended to the nonpositive case.

The first to understand that a different asymptotic 
behavior was expected at the critical temperature 
was Fisher and this resulted in the conjecture for the 
class of determinants generated by what is now 
known as Fisher-Hartwig symbols (Fisher and 
Hartwig 1968). Progress on the conjecture was 
made by many authors. Böttcher and Silbermann
(1998) have provided general results concerning
Toeplitz operators and determinants. Additional
information about the conjectures of Fisher and
Hartwig can be found in Böttcher and Silbermann
(1990, 1998), Ehrhardt (2001), and Ehrhardt and
Silbermann (1997).

Toeplitz determinants are also important in many 
other applications. One more recent area of interest 
is the connection between random-matrix theory 
and Toeplitz determinants. Many statistical quanti- 
ties for the circular unitary ensemble can be 
described as a Toeplitz determinant. For example, 
the probability of finding no eigenvalues in an 
interval can be expressed as a Toeplitz determinant. 
It is also the case that many of the most interesting 


statistics correspond to singular symbols. For basic 
random-matrix theory information see Mehta 
(1991), and for connections between the circular 
unitary ensemble and Toeplitz determinants, 
see Hughes (2001), Tracy and Widom (1993), and 
Widom (1994). 


See also: Integrable Systems in Random Matrix Theory; 
Two-Dimensional Ising Model. 
Basic Structure


The origins of Tomita-Takesaki modular theory lie 
in two unpublished papers of M Tomita in 1967 and 
a slim volume by Takesaki (1970). It has developed 
into one of the most important tools in the theory of 
operator algebras and has found many applications 
in mathematical physics.

Although the modular theory has been formulated 
in a more general setting, it will be presented in the 
form in which it most often finds application in 
mathematical physics (for generalizations, details, 
and further references concerning the material 
covered in this article, the reader is referred to the 
Further Reading section). Let $\mathcal{M}$ be a von Neumann
algebra on a Hilbert space $\mathcal{H}$ containing a vector $\Omega$
which is cyclic and separating for $\mathcal{M}$. Define the
operator $S_0$ on $\mathcal{H}$ as follows:

\[ S_0 A\Omega = A^*\Omega, \quad \text{for all } A \in \mathcal{M} \]

This operator extends to a closed antilinear operator
$S$ defined on a dense subset of $\mathcal{H}$. Let $\Delta$ be the
unique positive, self-adjoint operator and $J$ the
unique antiunitary operator occurring in the polar
decomposition

\[ S = J\Delta^{1/2} = \Delta^{-1/2} J \]

$\Delta$ is called the modular operator and $J$ the modular
conjugation (or modular involution) associated with
the pair $(\mathcal{M}, \Omega)$. Note that $J^2$ is the identity operator
and $J = J^*$. Moreover, the spectral calculus may be
applied to $\Delta$ so that $\Delta^{it}$ is a unitary operator for
each $t \in \mathbf{R}$ and $\{\Delta^{it} \mid t \in \mathbf{R}\}$ forms a strongly
continuous unitary group. Let $\mathcal{M}'$ denote the set of all
bounded linear operators on $\mathcal{H}$ which commute with
all elements of $\mathcal{M}$. The modular theory begins with
the following remarkable theorem.


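Theorem 1 Let $\mathcal{M}$ be a von Neumann algebra
with a cyclic and separating vector $\Omega$. Then
$J\Omega = \Omega = \Delta\Omega$, and the following equalities hold:

\[ J\mathcal{M}J = \mathcal{M}' \]

and

\[ \Delta^{it} \mathcal{M} \Delta^{-it} = \mathcal{M}, \quad \text{for all } t \in \mathbf{R} \]

Note that if one defines $F_0 A'\Omega = A'^*\Omega$, for all $A' \in \mathcal{M}'$,
and takes its closure $F$, then one has the relations

\[ \Delta = FS, \quad \Delta^{-1} = SF, \quad F = J\Delta^{-1/2} \]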
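
A finite-dimensional illustration may help fix ideas. The sketch below is an assumed toy model, not taken from the article: the algebra of all 2 x 2 complex matrices acts by left multiplication on the Hilbert space of 2 x 2 matrices with inner product tr(X* Y), and the cyclic and separating vector is the square root of a diagonal density matrix with eigenvalues `P`. In this model the modular operator sends X to rho X rho^{-1} and the modular conjugation is the adjoint. All names are hypothetical.

```python
import math

P = (0.7, 0.3)  # eigenvalues of an assumed faithful diagonal density matrix rho


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def adjoint(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


OMEGA = [[math.sqrt(P[0]), 0.0], [0.0, math.sqrt(P[1])]]  # vector rho^{1/2}


def delta_power(x, s):
    """Delta^s X = rho^s X rho^{-s}, computed entrywise for diagonal rho."""
    return [[(P[j] / P[k]) ** s * x[j][k] for k in range(2)] for j in range(2)]


def modular_conjugation(x):
    """J X = X* in this model."""
    return adjoint(x)


def distance(x, y):
    return max(abs(x[i][j] - y[i][j]) for i in range(2) for j in range(2))


A = [[1 + 2j, 0.5], [-1j, 3.0]]

# S(A Omega) = A* Omega should coincide with J Delta^{1/2} (A Omega).
s_on_vector = matmul(adjoint(A), OMEGA)
polar_form = modular_conjugation(delta_power(matmul(A, OMEGA), 0.5))
polar_error = distance(s_on_vector, polar_form)

# Omega is fixed by both J and Delta.
fixed_by_delta = distance(delta_power(OMEGA, 1.0), OMEGA)
fixed_by_j = distance(modular_conjugation(OMEGA), OMEGA)

# J L_A J acts as right multiplication by A*, which commutes with left
# multiplication, illustrating J M J = M' in this model.
X = [[0.2, 1j], [0.4, -0.6]]
jaj_x = modular_conjugation(matmul(A, modular_conjugation(X)))
commutant_error = distance(jaj_x, matmul(X, adjoint(A)))
```

When the two eigenvalues coincide, the modular operator reduces to the identity, which corresponds to the tracial case.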
Modular Automorphism Group


By Theorem 1, the unitaries $\Delta^{it}$, $t \in \mathbf{R}$, induce a
one-parameter automorphism group $\{\sigma_t\}$ of $\mathcal{M}$ by

\[ \sigma_t(A) = \Delta^{it} A \Delta^{-it}, \quad A \in \mathcal{M},\ t \in \mathbf{R} \]

This group is called the modular automorphism
group of $\mathcal{M}$ (relative to $\Omega$). Let $\omega$ denote the faithful
normal state on $\mathcal{M}$ induced by $\Omega$:

\[ \omega(A) = \frac{1}{\|\Omega\|^2} \langle \Omega, A\Omega \rangle, \quad A \in \mathcal{M} \]

From Theorem 1 it follows that $\omega$ is invariant under
$\{\sigma_t\}$, that is, $\omega(\sigma_t(A)) = \omega(A)$ for all $A \in \mathcal{M}$ and $t \in \mathbf{R}$.

The modular automorphism group contains information
about both $\mathcal{M}$ and $\omega$. For example, the
modular automorphism group consists of inner
automorphisms of $\mathcal{M}$ if and only if $\mathcal{M}$ is semifinite. It is
trivial if and only if $\omega$ is a tracial state on $\mathcal{M}$. Indeed,
for any $B \in \mathcal{M}$, one has $\sigma_t(B) = B$ for all $t \in \mathbf{R}$ if and
only if $\omega(AB) = \omega(BA)$ for all $A \in \mathcal{M}$. Let $\mathcal{M}^\omega$
denote the set of all such $B$ in $\mathcal{M}$.


The KMS Condition 


The modular automorphism group satisfies a condition
which had already been used in mathematical
physics to characterize equilibrium temperature
states of quantum systems in statistical mechanics
and field theory: the Kubo-Martin-Schwinger
(KMS) condition. If $\mathcal{M}$ is a von Neumann algebra
and $\{\alpha_t \mid t \in \mathbf{R}\}$ is a $\sigma$-weakly continuous
one-parameter group of automorphisms of $\mathcal{M}$, then the
state $\phi$ on $\mathcal{M}$ satisfies the KMS condition at (inverse
temperature) $\beta$ ($0 < \beta < \infty$) with respect to $\{\alpha_t\}$ if
for any $A, B \in \mathcal{M}$ there exists a complex function
$F_{A,B}(z)$ which is analytic on the strip $\{z \in \mathbf{C} \mid 0 <
\mathrm{Im}\, z < \beta\}$ and continuous on the closure of this strip
such that

\[ F_{A,B}(t) = \phi(\alpha_t(A) B) \]

\[ F_{A,B}(t + i\beta) = \phi(B \alpha_t(A)) \]

for all $t \in \mathbf{R}$. In this case, $\phi(\alpha_{i\beta}(A) B) = \phi(BA)$, for all
$A, B$ in a $\sigma$-weakly dense, $\alpha$-invariant *-subalgebra
of $\mathcal{M}$. Such KMS states are $\alpha$-invariant, that is,
$\phi(\alpha_t(A)) = \phi(A)$, for all $A \in \mathcal{M}$, $t \in \mathbf{R}$, and are stable
and passive (cf. Bratteli and Robinson (1981) and
Haag (1992)).

Every faithful normal state satisfies the KMS
condition at $\beta = 1$ (henceforth called the modular
condition) with respect to the corresponding modular
automorphism group.


Theorem 2 Let $\mathcal{M}$ be a von Neumann algebra
with a cyclic and separating vector $\Omega$. Then the
induced state $\omega$ on $\mathcal{M}$ satisfies the modular condition
with respect to the modular automorphism
group $\{\sigma_t \mid t \in \mathbf{R}\}$ associated to the pair $(\mathcal{M}, \Omega)$.
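
The modular condition can be made concrete in a matrix model. This sketch assumes the algebra of 2 x 2 complex matrices with the state given by tr(rho A) for a diagonal density matrix with eigenvalues `P`, and takes the modular group to act as rho^{it} A rho^{-it}; the names and the sample matrices are hypothetical. The boundary relation at inverse temperature 1 is tested by evaluating the entrywise analytic continuation at t + i.

```python
P = (0.7, 0.3)  # eigenvalues of an assumed faithful diagonal density matrix rho


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def state(x):
    """omega(X) = tr(rho X) for diagonal rho."""
    return sum(P[j] * x[j][j] for j in range(2))


def sigma(z, a):
    """Entrywise continuation of sigma_t(A) = rho^{it} A rho^{-it} to complex z."""
    return [[(P[j] / P[k]) ** (1j * z) * a[j][k] for k in range(2)]
            for j in range(2)]


A = [[1 + 2j, 0.5], [-1j, 3.0]]
B = [[0.3, 2 - 1j], [1.5j, -0.8]]
t = 0.4

# Lower boundary of the strip: F(t) = omega(sigma_t(A) B).
lower = state(matmul(sigma(t, A), B))

# Upper boundary at Im z = 1 should equal omega(B sigma_t(A)).
upper = state(matmul(sigma(t + 1j, A), B))
expected_upper = state(matmul(B, sigma(t, A)))
kms_error = abs(upper - expected_upper)

# Invariance of the state under the modular group.
invariance_error = abs(state(sigma(t, A)) - state(A))

# The state is not a trace here, so omega(AB) and omega(BA) differ.
trace_defect = abs(state(matmul(A, B)) - state(matmul(B, A)))
```

Setting the two eigenvalues equal makes the state tracial and the modular group trivial, in which case the trace defect vanishes as well.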


The modular automorphism group is, therefore, 
endowed with the analyticity associated with the 
KMS condition, and this is a powerful tool in 
many applications of the modular theory to 
mathematical physics. In addition, the physical 
properties and interpretations of KMS states are 
often invoked when applying modular theory to 
quantum physics. 

Note that while the nontriviality of the modular 
automorphism group gives a measure of the non- 
tracial nature of the state, the KMS condition for the 
modular automorphism group provides the missing 
link between the values w(AB) and w(BA), for all 
A,B € M (hence the use of the term “modular,” as 
in the theory of integration on locally compact 
groups). 

The modular condition is quite restrictive. Only
the modular group can satisfy the modular condition
for $(\mathcal{M}, \Omega)$, and the modular group for one state can
satisfy the modular condition only in states differing
from the original state by the action of an element in
the center of $\mathcal{M}$.


Theorem 3 Let $\mathcal{M}$ be a von Neumann algebra
with a cyclic and separating vector $\Omega$, and let $\{\sigma_t\}$
be the corresponding modular automorphism
group. If the induced state $\omega$ satisfies the modular
condition with respect to a group $\{\alpha_t\}$ of automorphisms
of $\mathcal{M}$, then $\{\alpha_t\}$ must coincide with $\{\sigma_t\}$.
Moreover, a normal state $\psi$ on $\mathcal{M}$ satisfies the
modular condition with respect to $\{\sigma_t\}$ if and only
if $\psi(\cdot) = \omega(h\,\cdot) = \omega(h^{1/2} \cdot h^{1/2})$ for some unique
positive injective operator $h$ affiliated with the
center of $\mathcal{M}$.


Hence, if M is a factor, two distinct states cannot 
share the same modular automorphism group. The 
relation between the modular automorphism groups 
for two different states will be described in more 
detail. 


One Algebra and Two States 


Consider a von Neumann algebra $\mathcal{M}$ with two
cyclic and separating vectors $\Omega$ and $\Phi$, and denote
by $\omega$ and $\phi$, respectively, the induced states on $\mathcal{M}$.
Let $\{\sigma_t^\omega\}$ and $\{\sigma_t^\phi\}$ denote the corresponding modular
groups. There is a general relation between the
modular automorphism groups of these states.


Theorem 4 There exists a $\sigma$-strongly continuous
map $\mathbf{R} \ni t \mapsto U_t \in \mathcal{M}$ such that

(i) $U_t$ is unitary for all $t \in \mathbf{R}$;
(ii) $U_{s+t} = U_s \sigma_s^\omega(U_t)$ for all $s, t \in \mathbf{R}$; and
(iii) $\sigma_t^\phi(A) = U_t \sigma_t^\omega(A) U_t^*$ for all $A \in \mathcal{M}$ and $t \in \mathbf{R}$.


The 1-cocycle $\{U_t\}$ is commonly called the cocycle
derivative of $\phi$ with respect to $\omega$ and one writes
$U_t = (D\phi : D\omega)_t$. There is a chain rule for this
derivative, as well: if $\phi$, $\psi$, and $\rho$ are faithful normal
states on $\mathcal{M}$, then $(D\psi : D\phi)_t = (D\psi : D\rho)_t (D\rho : D\phi)_t$
for all $t \in \mathbf{R}$. More can be said about the cocycle
derivative if the states satisfy any of the conditions
in the following theorem.


Theorem 5 The following conditions are
equivalent:

(i) $\phi$ is $\{\sigma_t^\omega\}$-invariant;

(ii) $\omega$ is $\{\sigma_t^\phi\}$-invariant;

(iii) there exists a unique positive injective operator
$h$ affiliated with $\mathcal{M}^\omega \cap \mathcal{M}^\phi$ such that $\omega(\cdot) =
\phi(h\,\cdot) = \phi(h^{1/2} \cdot h^{1/2})$;

(iv) there exists a unique positive injective operator
$h'$ affiliated with $\mathcal{M}^\omega \cap \mathcal{M}^\phi$ such that $\phi(\cdot) =
\omega(h'\,\cdot) = \omega(h'^{1/2} \cdot h'^{1/2})$;

(v) the norms of the linear functionals $\omega + i\phi$ and
$\omega - i\phi$ are equal; and

(vi) $\sigma_t^\omega \sigma_s^\phi = \sigma_s^\phi \sigma_t^\omega$, for all $s, t \in \mathbf{R}$.


The conditions in Theorem 5 turn out to be
equivalent to the cocycle derivative being a
representation.


Theorem 6 The cocycle $\{U_t\}$ intertwining $\{\sigma_t^\omega\}$ with
$\{\sigma_t^\phi\}$ is a group representation of the additive group
of reals if and only if $\phi$ and $\omega$ satisfy the conditions
in Theorem 5. In that case, $U(t) = h^{-it}$.


The operator $h' = h^{-1}$ in Theorem 5 is called the
Radon-Nikodym derivative of $\phi$ with respect to $\omega$
(often denoted by $d\phi/d\omega$), due to the following
result, which, if the algebra $\mathcal{M}$ is abelian, is the
well-known Radon-Nikodym theorem from measure
theory.


Theorem 7 If $\phi$ and $\omega$ are normal positive linear
functionals on $\mathcal{M}$ such that $\phi(A) \le \omega(A)$, for all
positive elements $A \in \mathcal{M}$, then there exists a unique
element $h'^{1/2} \in \mathcal{M}$ such that $\phi(\cdot) = \omega(h'^{1/2} \cdot h'^{1/2})$ and
$0 \le h'^{1/2} \le 1$.


The analogies with measure theory are not
accidental, although these are not discussed in detail
here. Indeed, any normal trace on a (finite) von
Neumann algebra $\mathcal{M}$ gives rise to a noncommutative
integration theory in a natural manner. Modular
theory affords an extension of this theory to the
setting of faithful normal functionals $\eta$ on von
Neumann algebras $\mathcal{M}$ of any type, enabling the
definition of noncommutative $L^p$ spaces, $L^p(\mathcal{M}, \eta)$.


Modular Invariants and the Classification 
of von Neumann Algebras 


As already mentioned, the modular structure carries 
information about the algebra. This is best evi- 
denced in the structure of type III factors. As this 
theory is rather involved, only a sketch of some of 
the results can be given. 

If $\mathcal{M}$ is a type III algebra, then its crossed
product $\mathcal{N} = \mathcal{M} \rtimes_{\sigma^\omega} \mathbf{R}$ relative to the modular
automorphism group of any faithful normal state
$\omega$ on $\mathcal{M}$ is a type II$_\infty$ algebra with a faithful
semifinite normal trace $\tau$ such that $\tau \circ \theta_t = e^{-t}\tau$,
$t \in \mathbf{R}$, where $\theta$ is the dual of $\sigma^\omega$ on $\mathcal{N}$. Moreover,
the algebra $\mathcal{M}$ is isomorphic to the crossed product
$\mathcal{N} \rtimes_\theta \mathbf{R}$, and this decomposition is unique in a very
strong sense. This structure theorem entails the
existence of important algebraic invariants for $\mathcal{M}$,
which has many consequences, one of which is made
explicit here.

If $\omega$ is a faithful normal state of a von Neumann
algebra $\mathcal{M}$ induced by $\Omega$, let $\Delta_\omega$ denote the modular
operator associated to $(\mathcal{M}, \Omega)$ and $\mathrm{sp}\,\Delta_\omega$ denote the
spectrum of $\Delta_\omega$. The intersection

\[ S'(\mathcal{M}) = \bigcap_\omega \mathrm{sp}\,\Delta_\omega \]

over all faithful normal states $\omega$ of $\mathcal{M}$ is an algebraic
invariant of $\mathcal{M}$.


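Theorem 8 Let $\mathcal{M}$ be a factor acting on a
separable Hilbert space. If $\mathcal{M}$ is of type III, then
$0 \in S'(\mathcal{M})$; otherwise, $S'(\mathcal{M}) = \{0, 1\}$ if $\mathcal{M}$ is of type
I$_\infty$ or II$_\infty$, and $S'(\mathcal{M}) = \{1\}$ if not. Let $\mathcal{M}$ now be a
factor of type III.

(i) $\mathcal{M}$ is of type III$_\lambda$, $0 < \lambda < 1$, if and only if
$S'(\mathcal{M}) = \{0\} \cup \{\lambda^n \mid n \in \mathbf{Z}\}$.
(ii) $\mathcal{M}$ is of type III$_0$ if and only if $S'(\mathcal{M}) = \{0, 1\}$.
(iii) $\mathcal{M}$ is of type III$_1$ if and only if $S'(\mathcal{M}) = [0, \infty)$.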


In certain physically relevant situations, the
spectra of the modular operators of all faithful
normal states coincide, so that Theorem 8 entails
that it suffices to compute the spectrum of any
conveniently chosen modular operator in order to
determine the type of $\mathcal{M}$. In other such situations,
there are distinguished states $\omega$ such that
$S'(\mathcal{M}) = \mathrm{sp}\,\Delta_\omega$. One such example is provided by
asymptotically abelian systems. A von Neumann
algebra $\mathcal{M}$ is said to be “asymptotically abelian” if
there exists a sequence $\{\alpha_n\}_{n \in \mathbf{N}}$ of automorphisms of
$\mathcal{M}$ such that the limit of $\{A\alpha_n(B) - \alpha_n(B)A\}_{n \in \mathbf{N}}$ in
the strong operator topology is zero, for all $A, B \in
\mathcal{M}$. If the state $\omega$ is $\alpha_n$-invariant, for all $n \in \mathbf{N}$, then
$\mathrm{sp}\,\Delta_\omega$ is contained in $\mathrm{sp}\,\Delta_\phi$, for all faithful normal
states $\phi$ on $\mathcal{M}$, so that $S'(\mathcal{M}) = \mathrm{sp}\,\Delta_\omega$. If, moreover,
$\mathrm{sp}\,\Delta_\omega = [0, \infty)$, then $\mathrm{sp}\,\Delta_\omega = \mathrm{sp}\,\Delta_\phi$, for all $\phi$ as
described.


Self-Dual Cones 


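Let $j : \mathcal{M} \to \mathcal{M}'$ denote the antilinear *-isomorphism
defined by $j(A) = JAJ$, $A \in \mathcal{M}$. The natural positive
cone $P^\natural$ associated with the pair $(\mathcal{M}, \Omega)$ is defined as
the closure, in $\mathcal{H}$, of the set of vectors

\[ \{ A j(A) \Omega \mid A \in \mathcal{M} \} \]

Let $\mathcal{M}_+$ denote the set of all positive elements of $\mathcal{M}$.
The following theorem collects the main attributes
of the natural cone.


Theorem 9

(i) $P^\natural$ coincides with the closure in $\mathcal{H}$ of the set
$\{ \Delta^{1/4} A \Omega \mid A \in \mathcal{M}_+ \}$.

(ii) $\Delta^{it} P^\natural = P^\natural$ for all $t \in \mathbf{R}$.

(iii) $J\Phi = \Phi$ for all $\Phi \in P^\natural$.

(iv) $A j(A) P^\natural \subset P^\natural$ for all $A \in \mathcal{M}$.

(v) $P^\natural$ is a pointed, self-dual cone whose linear
span coincides with $\mathcal{H}$.

(vi) If $\Phi \in P^\natural$, then $\Phi$ is cyclic for $\mathcal{M}$ if and only if
$\Phi$ is separating for $\mathcal{M}$.

(vii) If $\Phi \in P^\natural$ is cyclic, and hence separating, for
$\mathcal{M}$, then the modular conjugation and the
natural cone associated with the pair $(\mathcal{M}, \Phi)$
coincide with $J$ and $P^\natural$, respectively.

(viii) For every normal positive linear functional $\phi$
on $\mathcal{M}$, there exists a unique vector $\Phi_\phi \in P^\natural$
such that $\phi(A) = \langle \Phi_\phi, A\Phi_\phi \rangle$ for all $A \in \mathcal{M}$.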


In fact, the algebras $\mathcal{M}$ and $\mathcal{M}'$ are uniquely
characterized by the natural cone $P^\natural$. In light of
(viii), if $\alpha$ is an automorphism of $\mathcal{M}$, then

\[ V(\alpha)\Phi_\phi = \Phi_{\phi \circ \alpha^{-1}} \]

defines an isometric operator on $P^\natural$, which by (v)
extends to a unitary operator on $\mathcal{H}$. The map
$\alpha \mapsto V(\alpha)$ defines a unitary representation of the
group of automorphisms $\mathrm{Aut}(\mathcal{M})$ on $\mathcal{H}$ in such a
manner that $V(\alpha) A V(\alpha)^{-1} = \alpha(A)$ for all $A \in \mathcal{M}$ and
$\alpha \in \mathrm{Aut}(\mathcal{M})$. Indeed, one has the following:


Theorem 10 Let $\mathcal{M}$ be a von Neumann algebra
with a cyclic and separating vector $\Omega$. The group $\mathcal{V}$
of all unitaries $V$ satisfying

\[ V\mathcal{M}V^* = \mathcal{M}, \quad VJV^* = J, \quad VP^\natural = P^\natural \]

is isomorphic to $\mathrm{Aut}(\mathcal{M})$ under the above map
$\alpha \mapsto V(\alpha)$, which is called the “standard implementation”
of $\mathrm{Aut}(\mathcal{M})$.


Often of particular physical interest are (anti-)automorphisms
of $\mathcal{M}$ leaving $\omega$ invariant. They can only
be implemented by (anti)unitaries which leave
the pair $(\mathcal{M}, \Omega)$ invariant. In fact, if $U$ is a unitary
or antiunitary operator satisfying $U\Omega = \Omega$ and
$U\mathcal{M}U^* = \mathcal{M}$, then $U$ commutes with both $J$ and $\Delta$.


Two Algebras and One State 


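Motivated by applications to quantum field theory,
the study of the modular structures associated with
one state and more than one von Neumann algebra
has begun (see Borchers (2000) for references and
details). Let $\mathcal{N} \subset \mathcal{M}$ be von Neumann algebras
with a common cyclic and separating vector $\Omega$,
and $\Delta_\mathcal{N}, J_\mathcal{N}$ and $\Delta_\mathcal{M}, J_\mathcal{M}$ denote the corresponding
modular objects. The structure $(\mathcal{M}, \mathcal{N}, \Omega)$ is called
a $\pm$-half-sided modular inclusion if $\Delta_\mathcal{M}^{it} \mathcal{N} \Delta_\mathcal{M}^{-it} \subset
\mathcal{N}$, for all $\pm t \ge 0$.


Theorem 11 Let $\mathcal{M}$ be a von Neumann algebra
with cyclic and separating vector $\Omega$. The following
are equivalent:

(i) There exists a proper subalgebra $\mathcal{N} \subset \mathcal{M}$ such that
$(\mathcal{M}, \mathcal{N}, \Omega)$ is a $\mp$-half-sided modular inclusion.

(ii) There exists a unitary group $\{U(t)\}$ with positive
generator such that

\[ U(t)\mathcal{M}U(t)^{-1} \subset \mathcal{M}, \quad \text{for all } \pm t \ge 0, \qquad U(t)\Omega = \Omega, \quad \text{for all } t \in \mathbf{R} \]

Moreover, if these conditions are satisfied, then the
following relations must hold:

\[ \Delta_\mathcal{M}^{it} U(s) \Delta_\mathcal{M}^{-it} = \Delta_\mathcal{N}^{it} U(s) \Delta_\mathcal{N}^{-it} = U(e^{\mp 2\pi t} s) \]

and

\[ J_\mathcal{M} U(s) J_\mathcal{M} = J_\mathcal{N} U(s) J_\mathcal{N} = U(-s) \]

for all $s, t \in \mathbf{R}$. In addition, $\mathcal{N} = U(\pm 1)\mathcal{M}U(\pm 1)^{-1}$,
and if $\mathcal{M}$ is a factor, it must be type III$_1$.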


The richness of this structure is further suggested 
by the next theorem. 


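Theorem 12

(i) Let $(\mathcal{M}, \mathcal{N}_1, \Omega)$ and $(\mathcal{M}, \mathcal{N}_2, \Omega)$ be $-$-half-sided,
resp. $+$-half-sided, modular inclusions satisfying
the condition $J_{\mathcal{N}_1} J_{\mathcal{N}_2} = J_\mathcal{M} J_{\mathcal{N}_2} J_{\mathcal{N}_1} J_\mathcal{M}$. Then
the modular unitaries $\Delta_\mathcal{M}^{it}$, $\Delta_{\mathcal{N}_1}^{is}$, $\Delta_{\mathcal{N}_2}^{iu}$, $s, t, u \in \mathbf{R}$,
generate a faithful continuous unitary representation
of the identity component of the
group of isometries of two-dimensional Minkowski
space.

(ii) Let $\mathcal{M}$, $\mathcal{N}$, $\mathcal{N} \cap \mathcal{M}$ be von Neumann algebras
with a common cyclic and separating vector $\Omega$. If
$(\mathcal{M}, \mathcal{M} \cap \mathcal{N}, \Omega)$ and $(\mathcal{N}, \mathcal{M} \cap \mathcal{N}, \Omega)$ are $-$-half-sided,
resp. $+$-half-sided, modular inclusions such
that $J_\mathcal{N} \mathcal{M} J_\mathcal{N} = \mathcal{M}$, then the modular unitaries
$\Delta_\mathcal{M}^{it}$, $\Delta_\mathcal{N}^{is}$, $\Delta_{\mathcal{N} \cap \mathcal{M}}^{iu}$, $s, t, u \in \mathbf{R}$, generate a faithful
continuous unitary representation of
$SL(2, \mathbf{R})/\mathbf{Z}_2$.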


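This has led to a further useful notion. If $\mathcal{N}$ and $\mathcal{M}$ are von Neumann algebras
such that $\Omega$ is cyclic for $\mathcal{N} \cap \mathcal{M}$, then $(\mathcal{M}, \mathcal{N}, \Omega)$ is said to
be a “$\pm$-modular intersection” if both $(\mathcal{M}, \mathcal{M}\cap\mathcal{N}, \Omega)$
and $(\mathcal{N}, \mathcal{M}\cap\mathcal{N}, \Omega)$ are $\pm$-half-sided modular inclusions and

$$J_{\mathcal{N}}\Bigl[\lim_{t\to\mp\infty}\Delta_{\mathcal{N}}^{it}\Delta_{\mathcal{M}}^{-it}\Bigr]J_{\mathcal{N}} = \lim_{t\to\mp\infty}\Delta_{\mathcal{M}}^{it}\Delta_{\mathcal{N}}^{-it},$$

where the existence of the strong operator limits is
assured by the preceding assumptions. An example
of the utility of this structure is the following
theorem.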


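Theorem 13 Let $\mathcal{N}, \mathcal{M}, \mathcal{L}$ be von Neumann algebras
with a common cyclic and separating vector $\Omega$. If
$(\mathcal{M}, \mathcal{N}, \Omega)$ and $(\mathcal{N}', \mathcal{L}, \Omega)$ are $-$-modular intersections
and $(\mathcal{M}, \mathcal{L}, \Omega)$ is a $+$-modular intersection, then the
unitaries $\Delta_{\mathcal{M}}^{it}$, $\Delta_{\mathcal{N}}^{is}$, $\Delta_{\mathcal{L}}^{iu}$, $s,t,u \in \mathbb{R}$, generate a faithful
continuous unitary representation of $SO^{\uparrow}(1,2)$.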


These results and their extensions to larger 
numbers of algebras were developed for application 
in algebraic quantum field theory, but one may 
anticipate that half-sided modular inclusions will 
find wider use. Modular theory has also been
applied fruitfully in the theory of inclusions $\mathcal{N} \subset \mathcal{M}$
of properly infinite algebras with finite or infinite
index.


Applications in Quantum Theory 


The Tomita-Takesaki theory has found many 
applications in quantum field theory and quantum 
statistical mechanics. As mentioned earlier, the 
modular automorphism group satisfies the KMS 
condition, a property of physical significance in the 
quantum theory of many-particle systems, which 
includes quantum statistical mechanics and quantum 
field theory. In such settings, for a suitable algebra
of observables $\mathcal{M}$ and state $\omega$, an automorphism
group $\{\sigma_{\beta t}\}$ representing the time evolution of the
system satisfies the modular condition. Hence, on
the one hand, $\{\sigma_{\beta t}\}$ is the modular automorphism
group of the pair $(\mathcal{M}, \Omega)$, and, on the other, $\omega$ is an
equilibrium state at inverse temperature $\beta$, with all
the consequences which both of these facts have.
But it has become increasingly clear that the
modular objects $\Delta^{it}$, $J$ of certain algebras of
observables and states encode additional physical
information. In 1975, it was discovered that if one
considers the algebras of observables associated with
a finite-component quantum field theory satisfying
the Wightman axioms, then the modular objects
associated with the vacuum state and algebras of
observables localized in certain wedge-shaped
regions in Minkowski space have geometric content.
In fact, the unitary group $\{\Delta^{it}\}$ implements the group
of Lorentz boosts leaving the wedge region invariant
(this property is now called modular covariance),
and the modular involution $J$ implements the spacetime
reflection about the edge of the wedge, along
with a charge conjugation. This discovery prompted
intense research activity (see Baumgärtel and
Wollenberg 1992, Borchers 2000, Haag 1992).
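
The KMS condition invoked at the start of this section can be checked by hand in a finite-dimensional toy model. The following is a minimal sketch and not part of the theory above: it assumes a diagonal Hamiltonian with made-up energies, the Gibbs state $\omega(X) = \mathrm{Tr}(e^{-\beta H}X)/Z$, the time evolution $\alpha_t(B) = e^{itH}Be^{-itH}$, and the KMS condition in the form $\omega(A\,\alpha_{i\beta}(B)) = \omega(BA)$. All function and variable names are hypothetical.

```python
import cmath
import math

def gibbs_expectation(energies, beta, X):
    """omega(X) = Tr(exp(-beta H) X) / Z for the diagonal H = diag(energies)."""
    weights = [math.exp(-beta * e) for e in energies]
    return sum(w * X[j][j] for j, w in enumerate(weights)) / sum(weights)

def matmul(A, B):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def evolve(energies, z, B):
    """alpha_z(B) = exp(izH) B exp(-izH); z may be complex (analytic continuation)."""
    n = len(B)
    return [[cmath.exp(1j * z * (energies[j] - energies[k])) * B[j][k]
             for k in range(n)] for j in range(n)]

def kms_defect(energies, beta, A, B):
    """|omega(A alpha_{i beta}(B)) - omega(B A)|, which should vanish up to rounding."""
    lhs = gibbs_expectation(energies, beta,
                            matmul(A, evolve(energies, 1j * beta, B)))
    rhs = gibbs_expectation(energies, beta, matmul(B, A))
    return abs(lhs - rhs)

# Hypothetical three-level system and two arbitrary (non-commuting) matrices.
energies = [0.0, 0.7, 1.5]
A = [[1, 2j, 0], [0, 1, 3], [1, 0, -1]]
B = [[0, 1, 1j], [2, 0, 1], [1, -1, 2]]
```

Calling `kms_defect(energies, 2.0, A, B)` returns a number at the level of floating-point rounding, whatever matrices are chosen. In the infinite systems discussed in the text no density matrix exists, and the KMS condition itself takes over the role of the Gibbs formula.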


Positive Energy 


In quantum physics the time development of the 
system is often represented by a strongly continuous 
group $\{U(t) = e^{itH} \mid t \in \mathbb{R}\}$ of unitary operators, and
the generator H is interpreted as the total energy of 
the system. There is a link between modular 
structure and positive energy, which has found 
many applications in quantum field theory. This 
result was crucial in the development of Theorem 11 
and was motivated by the 1975 discovery mentioned 
above, now commonly called the Bisognano- 
Wichmann theorem. 


Theorem 14 Let $\mathcal{M}$ be a von Neumann algebra
with a cyclic and separating vector $\Omega$, and let $\{U(t)\}$
be a continuous unitary group satisfying $U(t)\mathcal{M}U(-t) \subset \mathcal{M}$, for all $t \ge 0$. Then any two of the
following conditions imply the third:

(i) $U(t) = e^{itH}$, with $H \ge 0$;
(ii) $U(t)\Omega = \Omega$, for all $t \in \mathbb{R}$; and
(iii) $\Delta^{it}U(s)\Delta^{-it} = U(e^{-2\pi t}s)$ and $JU(s)J = U(-s)$, for
all $s,t \in \mathbb{R}$.
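
Condition (iii) has the form of the commutation rule between dilations and translations of a line. A minimal sketch of that group law, assuming the purely geometric picture in which $\Delta^{it}$ acts on a light-ray coordinate $x$ as the dilation $x \mapsto e^{-2\pi t}x$, $U(s)$ as the translation $x \mapsto x + s$, and $J$ as the reflection $x \mapsto -x$. These affine maps are illustrative stand-ins chosen here, not the Hilbert-space operators themselves.

```python
import math

def compose(f, g):
    """Composition f after g of affine maps x -> a*x + b, stored as pairs (a, b)."""
    return (f[0] * g[0], f[0] * g[1] + f[1])

def inverse(f):
    """Inverse of the affine map x -> a*x + b (requires a != 0)."""
    return (1.0 / f[0], -f[1] / f[0])

def dilation(t):
    """Stand-in for Delta^{it}: x -> exp(-2*pi*t) * x."""
    return (math.exp(-2.0 * math.pi * t), 0.0)

def translation(s):
    """Stand-in for U(s): x -> x + s."""
    return (1.0, s)

# Stand-in for J: x -> -x (its own inverse).
REFLECTION = (-1.0, 0.0)

def conjugate(f, g):
    """The conjugated map f o g o f^{-1}."""
    return compose(compose(f, g), inverse(f))
```

For any `t` and `s`, `conjugate(dilation(t), translation(s))` agrees with `translation(math.exp(-2 * math.pi * t) * s)` up to rounding, and `conjugate(REFLECTION, translation(s))` is `translation(-s)`, mirroring the two relations in (iii).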


Modular Nuclearity and Phase Space Properties 


Modular theory can be used to express physically 
meaningful properties of quantum “phase spaces" 
by a condition of compactness or nuclearity of 
certain maps. In its initial form, the condition was 
formulated in terms of the Hamiltonian, the global 
energy operator of theories in Minkowski space. 
The above indications that the modular operators 
carry information about the energy of the system 
were reinforced when it was shown that a 
formulation in terms of modular operators was
essentially equivalent.

Let $O_1 \subset O_2$ be nonempty bounded open subregions
of Minkowski space with corresponding algebras of
observables $\mathcal{A}(O_1) \subset \mathcal{A}(O_2)$ in a vacuum representation
with vacuum vector $\Omega$, and let $\Delta$ be the modular
operator associated with $(\mathcal{A}(O_2), \Omega)$ (by the Reeh-Schlieder
theorem, $\Omega$ is cyclic and separating for
$\mathcal{A}(O_2)$). For each $\lambda \in (0, 1/2)$ define the mapping
$\Xi_\lambda : \mathcal{A}(O_1) \to \mathcal{H}$ by $\Xi_\lambda(A) = \Delta^\lambda A\Omega$. The compactness
of any one of these mappings implies the compactness
of all of the others. Moreover, the $\ell^p$ (nuclear) norms of
these mappings are interrelated and provide a measure
of the number of local degrees of freedom of the
system. Suitable conditions on the maps in terms of
these norms entail the strong statistical independence 
condition called the split property. Conversely, the split 
property implies the compactness of all of these maps. 
Moreover, the existence of equilibrium temperature 
states on the global algebra of observables can be 
derived from suitable conditions on these norms in the 
vacuum sector. 

The conceptual advantage of the modular com- 
pactness and nuclearity conditions compared to 
their original Hamiltonian form lies in the fact that 
they are meaningful also for quantum systems in 
curved spacetimes, where global energy operators 
(i.e., generators corresponding to global timelike 
Killing vector fields) need not exist. 


Modular Position and Quantum Field Theory 


The characterization of the relative “geometric” 
position of algebras based on the notions of modular 
inclusion and modular intersection was directly 
motivated by the Bisognano-Wichmann theorem. 
Observable algebras associated with suitably chosen 
wedge regions in Minkowski space provided exam- 
ples whose essential structure could be abstracted 
for more general application, resulting in the notions 
presented in the preceding sections. 

Theorem 12(ii) has been used to construct, from 
two algebras and the indicated half-sided modular 
inclusions, a conformal quantum field theory on the 
circle (compactified light ray) with positive energy. 
Since the chiral part of a conformal quantum field 
model in two spacetime dimensions naturally yields 
such half-sided modular inclusions, studying the 
inclusions in Theorem 12(ii) is equivalent to study- 
ing such field theories. Theorems 12(i) and 13 
and their generalizations to inclusions involving up 
to six algebras have been employed to construct 
Poincaré-covariant nets of observable algebras (the 
algebraic form of quantum field theories) satisfying 
the spectrum condition on $(d+1)$-dimensional
Minkowski space for $d = 1, 2, 3$. Conversely, such
quantum field theories naturally yield such systems 
of algebras. 

This intimate relation would seem to open up the 
possibility of constructing interacting quantum field 
theories from a limited number of modular inclu- 
sions/intersections. 


Geometric Modular Action 


The fact that the modular objects in quantum field 
theory associated with wedge-shaped regions and the 
vacuum state in Minkowski space have geometric 
significance (“geometric modular action”) was origin- 
ally discovered in the framework of the Wightman 
axioms. As an algebraic quantum field theory (AQFT) 
does not rely on the concept of Wightman fields, it was 
natural to ask (i) when does geometric modular action 
hold in AQFT and (ii) which physically relevant
consequences follow from this feature? 

There are two approaches to the study of 
geometric modular action. In the first, attention is 
focused on modular covariance, expressed in terms of 
the modular groups associated with wedge algebras 
and the vacuum state in Minkowski space. Modular 
covariance has been proven to obtain in conformally 
invariant AQFT, in any massive theory satisfying 
asymptotic completeness, and also in the presence of 
other, physically natural assumptions. To mention 
only three of its consequences, both the spin-statistics 
theorem and the PCT theorem, as well as the 
existence of a continuous unitary representation of 
the Poincaré group acting covariantly upon the 
observable algebras and satisfying the spectrum 
condition follow from modular covariance. 

In a second approach to geometric modular action, 
the modular involutions are the primary focus. Here, 
no a priori connection between the modular objects 
and isometries of the spacetime is assumed. The central 
assumption, given the state vector $\Omega$ and the von
Neumann algebras of localized observables $\{\mathcal{A}(O)\}$ on
the spacetime, is that there exists a family $\mathcal{W}$ of subsets
of the spacetime such that $J_{W_1}\mathcal{R}(W_2)J_{W_1} \in \{\mathcal{R}(W) \mid W \in \mathcal{W}\}$,
for every $W_1, W_2 \in \mathcal{W}$. This condition
makes no explicit appeal to isometries or other
special attributes and is thus applicable, in principle, to 
quantum field theories on general curved spacetimes. 

It has been shown for certain spacetimes, including 
Minkowski space, that under certain additional 
technical assumptions, the modular involutions 
encode enough information to determine the 
dynamics of the theory, the isometry group of the 
spacetime, and a continuous unitary representation of 
the isometry group which acts covariantly upon the 
observables and leaves the state invariant. In certain 
cases including Minkowski space, it is even possible
to derive the spacetime itself from the group $\mathcal{J}$
generated by the modular involutions $\{J_W \mid W \in \mathcal{W}\}$.

The modular unitaries $\Delta_W^{it}$ enter in this approach
through a condition which is designed to assure the
stability of the theory, namely that $\Delta_W^{it} \in \mathcal{J}$, for all
$t \in \mathbb{R}$ and $W \in \mathcal{W}$. In Minkowski space, this additional
condition entails that the derived representation
of the Poincaré group satisfies the spectrum condition. 


Further Applications 


As previously observed, through the close connec- 
tion to the KMS condition, modular theory enters 
naturally into the equilibrium thermodynamics of 
many-body systems. But in recent work on the 
theory of nonequilibrium thermodynamics it also 
plays a role in making mathematical sense of the 
notion of quantum systems in local thermodynamic 
equilibrium. Modular theory has also proved to be 
of utility in recent developments in the theory of 
superselection rules and their attendant sectors, 
charges and charge-carrying fields. 


See also: Algebraic Approach to Quantum Field Theory;
Axiomatic Quantum Field Theory; Quantum Central-Limit
Theorems; Symmetries in Quantum Field Theory of
Lower Spacetime Dimensions; Thermal Quantum Field
Theory; Positive Maps on C*-Algebras; Two-Dimensional
Models; von Neumann Algebras: Introduction, Modular
Theory, and Classification Theory.
Introduction


Symmetry-breaking phase transitions occur in a wide 
variety of systems — from condensed matter to the 
early universe. One of the common features of such 
transitions is the appearance, in the broken-symmetry 
phase, of topological defects, trapped regions in 
which the symmetry is restored, or at least changed. 
Examples are vortices in superfluids, domain walls in 
ferromagnets, and disclination lines in liquid crystals. 
Often these defects are stable for topological reasons, 
and play an important role in the dynamics of the 
system. An astonishingly rich variety of defects can be 
found in various systems. They can usefully be 
classified using the tools of homotopy theory. 


Spontaneous Symmetry Breaking 


Let us consider a quantum-mechanical system with a 
symmetry group $G$. This means that each $g \in G$ is
represented on the Hilbert space of quantum states
by a unitary operator $\hat U(g)$, which commutes with
the Hamiltonian. Spontaneous symmetry breaking
occurs if this symmetry is not shared by the ground
state or vacuum state $|0\rangle$ of the system. In other
words, for some $g \in G$, $\hat U(g)|0\rangle \neq |0\rangle$. Then the
ground state is necessarily degenerate: $\hat U(g)|0\rangle$ must
have the same energy as $|0\rangle$.
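
The degeneracy follows in one line from the fact that the symmetry operator commutes with the Hamiltonian. Writing $\hat{\mathsf{H}}$ for the Hamiltonian and $E_0$ for the ground-state energy (both symbols are introduced here only for illustration):

```latex
\hat{\mathsf{H}}\,\hat U(g)\,|0\rangle
  \;=\; \hat U(g)\,\hat{\mathsf{H}}\,|0\rangle
  \;=\; E_0\,\hat U(g)\,|0\rangle .
```

Hence every state $\hat U(g)|0\rangle$ is an eigenstate with the same eigenvalue $E_0$.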

Spontaneous symmetry breaking is usually 
describable in terms of an order-parameter field, 
which vanishes above the transition and is nonzero 
below it. We can find a scalar field $\hat\phi(\mathbf{r})$, or multiplet
of fields $\hat\phi = (\hat\phi_i, i = 1, \ldots, n)$ transforming according
to some representation $D$ of $G$ (assumed not to
contain the trivial representation), whose expectation
value in the ground state is nonzero:

$$\langle 0|\hat\phi(\mathbf{r})|0\rangle = \phi_0 \neq 0 \qquad [1]$$

This is the order parameter. Since

$$\langle 0|\hat U^{\dagger}(g)\,\hat\phi(\mathbf{r})\,\hat U(g)|0\rangle = D(g)\phi_0 \qquad [2]$$

it follows that the only elements of $G$ that can be
symmetries of the ground state are those in the
stability subgroup $H$ of $\phi_0$ (the group of unbroken
symmetries in this ground state):

$$H = \{g \in G : D(g)\phi_0 = \phi_0\} \qquad [3]$$


In terms of this subgroup, we can find a useful
characterization of the manifold $\mathcal{M}$ of degenerate
ground states. As noted above, for each $g \in G$,
$\hat U(g)|0\rangle$ is also a ground state. However, these are
not all distinct, because clearly $\hat U(gh)|0\rangle = \hat U(g)|0\rangle$
for all $h \in H$. Hence, the distinct ground states are
in one-to-one correspondence with the left cosets $gH$
of $H$ in $G$, and $\mathcal{M}$ may be identified with the
quotient space $G/H$, the space of left cosets.

For example, suppose $G$ is the rotation group
$SO(3)$, and $\hat\phi$ belongs to the three-dimensional
vector representation. If $\phi_0 \neq 0$ in the ground state,
we may choose $\phi_0 = (0, 0, v)$. Then, clearly,
$H = SO(2)$, the group of rotations about the $z$-axis,
and $\mathcal{M} = SO(3)/SO(2) = S^2$, the 2-sphere. It is useful
to think of $\mathcal{M}$ as the subset of the order-parameter
space comprising the possible expectation values
$\phi = \langle\hat\phi\rangle$ for the various degenerate ground states. For
example, in this case, $\mathcal{M} = \{\phi : \phi^2 = v^2\}$.
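
The identification of the stability subgroup and of the orbit can be made concrete numerically. A minimal sketch, assuming the vector representation acts by ordinary 3 x 3 rotation matrices and taking $v = 2$ as an arbitrary illustrative value; the helper names are hypothetical.

```python
import math

def rot_z(a):
    """Rotation by angle a about the z-axis."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(a):
    """Rotation by angle a about the x-axis."""
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def apply(R, vec):
    """Matrix-vector product R * vec."""
    return tuple(sum(R[i][j] * vec[j] for j in range(3)) for i in range(3))

def norm(vec):
    return math.sqrt(sum(x * x for x in vec))

v = 2.0                   # illustrative magnitude of the order parameter
phi0 = (0.0, 0.0, v)      # chosen ground-state value

fixed = apply(rot_z(0.9), phi0)   # rotations about z leave phi0 unchanged
moved = apply(rot_x(0.9), phi0)   # other rotations move phi0 along the sphere
```

Here `fixed` equals `phi0`, so rotations about the $z$-axis belong to the stability subgroup, while `moved` is a different point with the same length `v`, that is, another point of the sphere of degenerate ground-state values.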


Defect Formation 


It is often possible to characterize the dynamics at 
finite temperature in terms of a function of the order
parameter, the effective potential $V(\phi)$, which is
necessarily invariant under $G$, and whose minima
define the equilibrium states. At low temperatures, it
has a form like $V = \lambda(\phi^2 - v^2)^2$, whose minima
occur at nonzero values of $\phi$. But above the critical
temperature $T_c$, the only minimum is at $\phi = 0$, so the
equilibrium state is symmetric under $G$. In the high-temperature
phase, there may be large fluctuations
in $\phi$, but its mean value will be zero.

Now, when the system is cooled through the 
phase transition, $\phi$ will acquire a nonzero expectation
value, gradually approaching one of the
degenerate ground states characterized by a point 
of M. But the choice of which one is unpredict- 
able; the symmetry breaking is spontaneous. 
Moreover, in a large system, there is no reason 
why the same choice should be made everywhere. 
For example, a ferromagnet cooling through its 
Curie point may acquire a spontaneous magneti- 
zation in different directions in different parts of 
the sample. 

Of course, there is an energetic penalty to having 
a spatially varying order parameter, so it will tend to 
become more uniform as the temperature is lowered. 
But the question arises whether there may be any 
topological obstruction to this process. It can 
happen that if we choose points on $\mathcal{M}$ in a
continuous manner everywhere around the periphery
of some region, it is topologically impossible to
complete the process throughout its interior.
Continuity may require that there are points where
$\phi$ leaves the surface $\mathcal{M}$. For example, if our
ferromagnet has two opposite possible directions of
easy magnetization, described by $\phi_0$ and $-\phi_0$, then
$\mathcal{M}$ consists essentially of these two points. Regions
where $\phi \approx \phi_0$ and where $\phi \approx -\phi_0$ must be separated
by domain walls across which $\phi$ varies smoothly
from one to the other.


Homotopy Groups 


To classify the various possible types of defect, we 
need to consider the homotopy groups of the 
manifold M of degenerate ground states. In this 
section, we briefly review the necessary definitions. 

A path in $\mathcal{M}$ is a map $\phi : I \to \mathcal{M}$ from the unit
interval $I = [0, 1] \subset \mathbb{R}$. We choose a base point $m_0 \in \mathcal{M}$
(which may be identified with $\phi_0$), and consider
loops in $\mathcal{M}$, paths such that $\phi(0) = \phi(1) = m_0$. We
say that two loops are homotopic, and write $\phi \sim \psi$,
if one can be continuously deformed into the other
within $\mathcal{M}$, that is, if there exists a map $\chi : I^2 \to \mathcal{M}$
such that

$$\chi(0, t) = \phi(t) \quad\text{and}\quad \chi(1, t) = \psi(t) \qquad [4]$$

for all $t$, and

$$\chi(s, 0) = \chi(s, 1) = m_0 \qquad [5]$$

for all $s$. This is an equivalence relation. The set
$\pi_1(\mathcal{M})$ is the set of equivalence classes $[\phi]$ of loops
under this relation.

On the set of loops, we may define a product $\phi\psi$,
comprising the loop $\phi$ followed by $\psi$ (see Figure 1).
Explicitly,

$$(\phi\psi)(t) = \begin{cases} \phi(2t), & 0 \le t \le \tfrac12 \\ \psi(2t - 1), & \tfrac12 \le t \le 1 \end{cases} \qquad [6]$$

Figure 1 The product of loops.

It is easy to show that if $\phi \sim \phi'$ and $\psi \sim \psi'$, then
$\phi\psi \sim \phi'\psi'$. Hence, this defines a product on $\pi_1(\mathcal{M})$,
by $[\phi][\psi] = [\phi\psi]$. So equipped, $\pi_1(\mathcal{M})$ becomes the
fundamental group or first homotopy group of $\mathcal{M}$.
Note that the identity is the equivalence class $[\phi_0]$ of
the trivial loop with $\phi_0(t) = m_0$, while the inverse is
$[\phi]^{-1} = [\bar\phi]$, where the map $\bar\phi$ is the reverse of
$\phi$: $\bar\phi(t) = \phi(1 - t)$.
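
The group structure can be seen explicitly when the target space is a circle. The sketch below is illustrative only: it assumes loops in the unit circle of the complex plane based at the point 1, implements the product and the reverse exactly as defined above, and labels each homotopy class by a winding number obtained by summing small phase increments. The sampling resolution is an arbitrary choice.

```python
import cmath
import math

def loop(n):
    """A loop on the unit circle, based at 1, that winds n times."""
    return lambda t: cmath.exp(2j * math.pi * n * t)

def product(phi, psi):
    """The loop phi followed by psi, each traversed at double speed."""
    return lambda t: phi(2 * t) if t <= 0.5 else psi(2 * t - 1)

def reverse(phi):
    """The loop phi traversed backwards."""
    return lambda t: phi(1 - t)

def winding(phi, samples=2000):
    """Total phase advance around the loop, in units of 2*pi, rounded."""
    total = 0.0
    prev = phi(0.0)
    for k in range(1, samples + 1):
        cur = phi(k / samples)
        total += cmath.phase(cur / prev)   # small increment in (-pi, pi]
        prev = cur
    return round(total / (2 * math.pi))
```

With these definitions `winding(product(loop(2), loop(-3)))` gives `-1` and `winding(product(loop(1), reverse(loop(1))))` gives `0`, so the product of classes corresponds to addition of integers and the reversed loop represents the inverse class.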

Strictly speaking, we should write $\pi_1(\mathcal{M}, m_0)$ in
place of $\pi_1(\mathcal{M})$. However, for any path-connected
space, the groups $\pi_1(\mathcal{M}, m_0)$ and $\pi_1(\mathcal{M}, m_1)$ are
always isomorphic, and, more importantly, the same
is true for any coset space $\mathcal{M} = G/H$, where $G$ is a
Lie group and $H$ a closed subgroup. For a general
manifold $\mathcal{M}$, $\pi_1(\mathcal{M})$ is not necessarily abelian, but it
is so if $\mathcal{M}$ is a Lie group, or more generally a
Riemannian symmetric space. The space $\mathcal{M}$ is said to
be simply connected if $\pi_1(\mathcal{M}) = 0$, the group comprising
only the identity element, $0 = \{[\phi_0]\}$. (Although
$\pi_1(\mathcal{M})$ is not always abelian, it is conventional for
homotopy groups to use an additive notation and
represent the trivial group by 0 rather than 1.)

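The $n$th homotopy group $\pi_n(\mathcal{M})$ may be defined
similarly, as a set of equivalence classes of maps
$\phi : I^n \to \mathcal{M}$ such that $\phi$ maps the entire boundary $\partial I^n$
to the base point $m_0$. Two such maps are homotopic
($\phi \sim \psi$) if there exists a map $\chi : I^{n+1} \to \mathcal{M}$ such that

$$\chi(0, \mathbf{t}) = \phi(\mathbf{t}) \quad\text{and}\quad \chi(1, \mathbf{t}) = \psi(\mathbf{t}) \qquad [7]$$

for all $\mathbf{t} = (t_1, \ldots, t_n)$, and, for each $s \in I$, $\chi(s, \mathbf{t}) = m_0$
for all $\mathbf{t} \in \partial I^n$. The product $\phi\psi$ is defined by

$$(\phi\psi)(t_1, \ldots, t_n) = \begin{cases} \phi(2t_1, t_2, \ldots, t_n), & 0 \le t_1 \le \tfrac12 \\ \psi(2t_1 - 1, t_2, \ldots, t_n), & \tfrac12 \le t_1 \le 1 \end{cases} \qquad [8]$$

The choice of $t_1$ rather than any other $t_i$ is arbitrary;
all choices yield homotopic product maps. The
product again defines a product on $\pi_n(\mathcal{M})$, which
thereby becomes a group, the $n$th homotopy group.
One new feature is that, for all $n > 1$, $\pi_n(\mathcal{M})$ is
always abelian.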

Note that since the entire boundary of $I^n$ is
mapped to a single point, it is possible to collapse it,
and talk instead about maps from the $n$-sphere $S^n$ to
$\mathcal{M}$, taking one designated point to $m_0$. The fact that
$\pi_n(\mathcal{M})$ is nontrivial indicates the existence in $\mathcal{M}$ of
closed $n$-surfaces that cannot be smoothly shrunk to
a point. In particular, it is worth noting that, for any $n$,
$\pi_n(S^n) = \mathbb{Z}$, the additive group of integers, while
$\pi_m(S^n) = 0$ for all $m < n$.

A special case is $n = 0$. Here, $S^0$ comprises two
points only, and since one of them is always mapped
to $m_0$, we really have to consider maps from a single
point to $\mathcal{M}$, that is, points in $\mathcal{M}$. Two points are
homotopic if they can be joined by a path in $\mathcal{M}$.
Thus, $\pi_0(\mathcal{M})$ may be identified with the set of path-connected
components of $\mathcal{M}$. Note, however, that in
general no product can be defined on $\pi_0(\mathcal{M})$, so
$\pi_0(\mathcal{M})$ should be called the zeroth homotopy set
(not group). There is an important exception,
however: if $G$ is a Lie group, and $G_0$ its connected
subgroup (the subset of elements joined by paths to
the identity $e$), then $\pi_0(G)$ may be identified with
the quotient group $G/G_0$. Note, however, that this
group $\pi_0(G) = G/G_0$ is not necessarily abelian.


Classification of Defects 


We now turn to the classification of defects by 
means of homotopy groups. It will be useful to start 
with simple specific examples in three-dimensional 
space, $\mathbb{R}^3$.

First, suppose again that $\phi$ belongs to the vector
representation of $G = SO(3)$. Then $\mathcal{M} = SO(3)/SO(2) = S^2$
may be identified with the sphere
$\mathcal{M} = \{\phi : \phi^2 = v^2\}$ in $\phi$ space. Consider a closed surface
$S$, an embedding of a 2-sphere $S^2$ in $\mathbb{R}^3$. Assume
that everywhere on $S$ the field $\phi(\mathbf{r})$ has one of the
ground-state values. In other words, we have a map
$\phi : S \to \mathcal{M}$, from one 2-sphere to another. The map $\phi$
can be extended to a map from the interior of $S$ to $\mathcal{M}$
only if it belongs to the trivial homotopy class $[\phi_0] \in \pi_2(\mathcal{M})$,
where $\phi_0 : I^2 \to \mathcal{M} : (t_1, t_2) \mapsto m_0 = eH$. In all
other cases, there must be at least one point where
$\phi(\mathbf{r}) = 0$; this is a point defect. The second homotopy
group in this case is $\pi_2(S^2) = \mathbb{Z}$, so the possible
point defects, or monopoles, are labeled by an integer
$n \in \mathbb{Z}$, the winding number. (An example of a map
with winding number $n$ is (in spherical polars)
$(r, \theta, \varphi) \mapsto (v, \theta, n\varphi)$.)
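
The winding number of such a map between 2-spheres can be estimated numerically as its degree. The following is a rough sketch under stated assumptions: the map is given as a unit-vector field $\mathbf{f}(\theta, \varphi)$, and the degree is taken to be $(1/4\pi)\int \mathbf{f}\cdot(\partial_\theta\mathbf{f} \times \partial_\varphi\mathbf{f})\,d\theta\,d\varphi$, evaluated with a midpoint rule and central differences. Grid size and step `h` are arbitrary illustrative choices.

```python
import math

def hedgehog(n):
    """Unit vector with polar angles (theta, n*phi): the map quoted in the text."""
    def field(theta, phi):
        return (math.sin(theta) * math.cos(n * phi),
                math.sin(theta) * math.sin(n * phi),
                math.cos(theta))
    return field

def degree(field, steps=120, h=1e-5):
    """Approximate degree of a map S^2 -> S^2 given in spherical angles."""
    dtheta = math.pi / steps
    dphi = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * dtheta
        for j in range(steps):
            phi = (j + 0.5) * dphi
            f = field(theta, phi)
            # Central-difference partial derivatives of the field.
            ft = [(a - b) / (2 * h)
                  for a, b in zip(field(theta + h, phi), field(theta - h, phi))]
            fp = [(a - b) / (2 * h)
                  for a, b in zip(field(theta, phi + h), field(theta, phi - h))]
            cross = (ft[1] * fp[2] - ft[2] * fp[1],
                     ft[2] * fp[0] - ft[0] * fp[2],
                     ft[0] * fp[1] - ft[1] * fp[0])
            total += sum(a * b for a, b in zip(f, cross)) * dtheta * dphi
    return round(total / (4 * math.pi))
```

For the map above the integrand reduces to $n\sin\theta$, so `degree(hedgehog(n))` returns `n`, while a constant field (the trivial class) gives `0`.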

More generally, point defects in $\mathbb{R}^d$ are classified
by $\pi_{d-1}(\mathcal{M})$. A map $\phi$ from a closed $(d-1)$-dimensional
surface $S \subset \mathbb{R}^d$ to $\mathcal{M}$ can be extended
to the interior of $S$ if and only if it belongs to the
trivial homotopy class $[\phi_0] \in \pi_{d-1}(\mathcal{M})$. If this is not
the case, there must be at least one point around
which $\phi(\mathbf{r})$ leaves the surface $\mathcal{M}$, although in general
it is not required to vanish anywhere.

Second, take the case where $\phi$ is a single complex
field, and $G$ is the phase symmetry group $U(1)$. In
this case, $H$ is the subgroup $\mathbf{1} = \{1\} \subset G$. Thus,
$\mathcal{M} = U(1)/\mathbf{1} = S^1$; this manifold may be identified
with the circle $\{\phi : |\phi| = v\}$ in the order-parameter
space. Now consider a closed loop $C$ in space, an
embedding of $S^1$ in $\mathbb{R}^3$ (see Figure 2). Suppose that
on $C$, $\phi(\mathbf{r})$ takes one of the ground-state values,
say $\phi(\mathbf{r}) = v\exp[i\alpha(\mathbf{r})]$. If $S$ is some surface with
boundary $C$, then the map $\phi : C \to \mathcal{M}$ can be
extended to a map $\phi : S \to \mathcal{M}$ if and only if it
belongs to the trivial homotopy class $[\phi_0] \in \pi_1(\mathcal{M})$.
If it does not, then there must be at least one
point on $S$ within $C$ where $\phi = 0$. Moreover, this
must be true of every surface $S$ spanning $C$, so there
must be a curve passing through $C$ along which
$\phi = 0$. This is a linear defect, a string or vortex line.
In this case, the first homotopy group is $\pi_1(S^1) = \mathbb{Z}$,
so we see that the possible linear defects are
classified by an integer, the winding number $n$. An
example of a map with winding number $n$ is
$\phi = v e^{in\varphi}$.

Figure 2 A linear defect.
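
In practice, the winding number of a complex order parameter around a closed loop in space can be read off from sampled field values. A minimal sketch, assuming an idealized vortex field $\phi = v e^{in\varphi}$ centered on the origin of the $x$-$y$ plane and a circular test loop; the function names, loop radius and number of samples are illustrative choices.

```python
import cmath
import math

def vortex_field(n, v=1.0):
    """Idealized field v*exp(i*n*angle), with angle the azimuth of (x, y)."""
    def phi(x, y):
        return v * cmath.exp(1j * n * math.atan2(y, x))
    return phi

def winding_number(phi, center=(0.0, 0.0), radius=1.0, samples=400):
    """Net phase advance of phi around a circle, in units of 2*pi."""
    cx, cy = center
    total = 0.0
    prev = phi(cx + radius, cy)
    for k in range(1, samples + 1):
        a = 2 * math.pi * k / samples
        cur = phi(cx + radius * math.cos(a), cy + radius * math.sin(a))
        total += cmath.phase(cur / prev)   # increment between neighbouring samples
        prev = cur
    return round(total / (2 * math.pi))
```

A loop encircling the vortex line returns the winding number `n`, whereas the same field sampled on a loop that does not enclose the origin, for example `center=(3.0, 0.0)`, returns `0`: only loops threaded by the defect detect it.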

Again, this result can easily be generalized. Linear
defects in $\mathbb{R}^d$ are classified by $\pi_{d-2}(\mathcal{M})$. If, on a
$(d-2)$-dimensional surface $C$, $\phi(\mathbf{r})$ takes values in
$\mathcal{M}$, and if it does not belong to the trivial homotopy
class, there must be a linear defect threading
through $C$, around which $\phi$ leaves the surface $\mathcal{M}$,
although again it need not necessarily vanish.

More generally yet, in the $d$-dimensional space $\mathbb{R}^d$,
defects of dimension $p$ are classified by the homotopy
group $\pi_{d-p-1}(\mathcal{M})$. For example, in three dimensions,
planar defects (domain walls) are classified by
$\pi_0(\mathcal{M})$.


The Exact Sequence 


There are mathematical theorems that greatly
facilitate the computation of the homotopy groups
of homogeneous spaces, of the form $\mathcal{M} = G/H$.

We begin with the maps relating these spaces to
each other. There is a canonical injective homomorphism
$i : H \to G : h \mapsto h$, and a canonical projection
associating each element of $G$ with its coset:
$p : G \to \mathcal{M} : g \mapsto gH$. Moreover, it is clear that the
image of $i$, namely the subgroup $H$, is also the kernel
of $p$, the inverse image $p^{-1}(m_0)$ of the distinguished
element $m_0 = eH$ of $\mathcal{M}$. These statements can be
summarized by saying that

$$\mathbf{1} \to H \xrightarrow{\;i\;} G \xrightarrow{\;p\;} \mathcal{M} \to \mathbf{1}$$

is an exact sequence: the image of each map is the
kernel of the following one (see Figure 3).

Figure 3 An exact sequence.

Next, we note that since any closed loops (or
$n$-surfaces) in $H$ belonging to the same homotopy
class are also homotopic as loops (or $n$-surfaces) in
$G$, there is an induced homomorphism $i_* : \pi_n(H) \to \pi_n(G)$.
Similarly, homotopic loops or $n$-surfaces in $G$
project to homotopic loops or $n$-surfaces in $\mathcal{M}$, so
there is an induced homomorphism $p_* : \pi_n(G) \to \pi_n(\mathcal{M})$.
Moreover, it is easy to see that although $i_*$ is
not necessarily injective and $p_*$ not necessarily
surjective, it is true that the image of $i_*$ is the kernel
of $p_*$. For example, any loop in $G$ will be mapped to
a homotopically trivial loop in $\mathcal{M}$ if and only if it is
homotopic to the image of a loop in $H$.

In addition, there is a boundary map that
relates homotopy groups of different dimension:
$\partial_* : \pi_{n+1}(\mathcal{M}) \to \pi_n(H)$. To see this, it is useful to
think of $G$ as a fiber bundle with base space $\mathcal{M}$ and
fiber $H$. Now consider a map $\phi : (I^{n+1}, \partial I^{n+1}) \to (\mathcal{M}, m_0)$.
Since $p$ is a projection, $\phi$ can always be
lifted to a map $\tilde\phi : (I^{n+1}, \partial I^{n+1}) \to (G, H)$, that is, we
can find a (nonunique) map $\tilde\phi$ such that $\phi = p \circ \tilde\phi$
(see Figure 4). However, $\tilde\phi$ does not necessarily map
the boundary to a single point; what is true is that $\tilde\phi$
must map the boundary to a subset of $H$, and since
topologically $\partial I^{n+1} \simeq S^n$, this defines a map $\partial\phi : S^n \to H$.
If we allow $\phi$ to vary over some homotopy class
of maps, and $\tilde\phi$ to vary continuously, then $\partial\phi$ will
also remain in one homotopy class. Thus, we have
defined a map $\partial_* : \pi_{n+1}(\mathcal{M}) \to \pi_n(H) : [\phi] \mapsto [\partial\phi]$.

Figure 4 Lift of a loop.

It is also easy to see that the image
of $\partial_* : \pi_{n+1}(\mathcal{M}) \to \pi_n(H)$ is the kernel of $i_* : \pi_n(H) \to \pi_n(G)$,
because the $n$-surface in $H$ defined by $\partial\phi$ is
necessarily homotopically trivial in $G$. Similarly, one
can see that the image of $p_* : \pi_{n+1}(G) \to \pi_{n+1}(\mathcal{M})$ is
the kernel of $\partial_* : \pi_{n+1}(\mathcal{M}) \to \pi_n(H)$.

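Putting all these results together, we see that there
is a (semi-infinite) exact sequence connecting all the
homotopy groups:

$$\begin{aligned} \cdots &\to \pi_n(H) \xrightarrow{i_*} \pi_n(G) \xrightarrow{p_*} \pi_n(\mathcal{M}) \xrightarrow{\partial_*} \pi_{n-1}(H) \to \cdots \\ \cdots &\to \pi_1(H) \xrightarrow{i_*} \pi_1(G) \xrightarrow{p_*} \pi_1(\mathcal{M}) \xrightarrow{\partial_*} \pi_0(H) \xrightarrow{i_*} \pi_0(G) \xrightarrow{p_*} \pi_0(\mathcal{M}) \to 0 \end{aligned} \qquad [9]$$

This sequence makes it easy to compute most of
the low-dimensional homotopy groups of $\mathcal{M}$. Let us
begin with $\pi_0(\mathcal{M})$, which merely labels its disconnected
components. As noted earlier, for the Lie
group $G$, $\pi_0(G)$ is the quotient group $\pi_0(G) = G/G_0$,
where $G_0$ is the connected subgroup of $G$. Now the
image of $\pi_0(H)$ under $i_*$ is clearly the set of
connected components of $G$ that contain elements
of $H$, so if $G$ has $m$ connected components, and $n$ of
them contain elements of $H$, then $\pi_0(\mathcal{M})$ has $m/n$
elements (see Figure 5).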
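
The counting rule can be tried out on a finite group, where the arithmetic is transparent. This is only a toy illustration under a simplifying assumption: a finite group is discrete, so every element is its own connected component, $m$ is the order of the group and $n$ the order of the subgroup. The choice of the permutation group on three objects and of a two-element subgroup is arbitrary.

```python
from itertools import permutations

def compose(p, q):
    """Composition of permutations stored as tuples: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def left_cosets(group, subgroup):
    """The set of distinct left cosets gH, each stored as a frozenset."""
    return {frozenset(compose(g, h) for h in subgroup) for g in group}

# Toy example: G = all permutations of three objects (6 components),
# H = {identity, swap of the first two objects} (2 components).
G = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]
cosets = left_cosets(G, H)
```

Here `len(cosets)` is `3`, which equals `len(G) // len(H)`, matching the rule that the coset space has $m/n$ components.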

Figure 5 The disconnected components of $G$ are shaded, those
of $H$ are cross-hatched. Here $\pi_0(\mathcal{M})$ has two elements.

Next, we note that, for all the higher homotopy
groups, disconnected pieces are irrelevant. Since a
loop, for example, starting at $m_0$ must remain
within its connected component $\mathcal{M}_0 \subset \mathcal{M}$, it
follows that $\pi_1(\mathcal{M}) = \pi_1(\mathcal{M}_0)$, and similarly
$\pi_n(\mathcal{M}) = \pi_n(\mathcal{M}_0)$ for all $n \ge 1$. So one can ignore
any disconnected parts of the symmetry group $G$,
and assume from now on that $\pi_0(G) = 0$. Moreover,
it is always possible to replace $G$ by its simply
connected covering group, replacing $SO(3)$, for
example, by $SU(2)$. Thus, we may also assume that
$\pi_1(G) = 0$. Then the section of the exact sequence in
the second line of [9] becomes

$$0 \to \pi_1(\mathcal{M}) \xrightarrow{\;\partial_*\;} \pi_0(H) \to 0$$

which implies that the two groups in the center are
isomorphic:

$$\pi_1(\mathcal{M}) \cong \pi_0(H) \qquad [10]$$


For example, if the symmetry group $G = SO(3)$ is
completely broken, so that $H = \mathbf{1}$, then replacing $G$ by
$\tilde G = SU(2)$ requires replacing $H$ by $\tilde H = \{+1, -1\} \cong \mathbb{Z}_2$,
hence also $\pi_1(\mathcal{M}) = \pi_0(\tilde H) = \mathbb{Z}_2$; there is only
one nontrivial class of linear defects in this model.
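
The two-element group arising here reflects the fact that two elements of $SU(2)$ cover each rotation. A minimal sketch, assuming the standard description of $SU(2)$ by unit quaternions $q = (w, x, y, z)$ and the usual quaternion-to-rotation-matrix formula; the function names are hypothetical.

```python
import math

def rotation_from_quaternion(q):
    """Rotation matrix of the unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    return (
        (1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)),
        (2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)),
        (2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)),
    )

def negate(q):
    """The other element of SU(2) lying over the same rotation."""
    return tuple(-c for c in q)

# Quaternion for a rotation by angle a about the z-axis (illustrative value).
a = 0.8
q = (math.cos(a / 2), 0.0, 0.0, math.sin(a / 2))

same_rotation = rotation_from_quaternion(q) == rotation_from_quaternion(negate(q))
identity_preimages = [(1, 0, 0, 0), (-1, 0, 0, 0)]
```

Because the matrix is quadratic in the quaternion components, `q` and `negate(q)` give identical rotations, and both entries of `identity_preimages` map to the unit matrix. These two preimages of the identity form the group written $\{+1, -1\}$ above.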

To find $\pi_2(\mathcal{M})$, we need a standard theorem
about Lie groups, namely that the second homotopy
group of any Lie group is trivial: for any
$G$, $\pi_2(G) = 0$. (No details of the proof are given
here. It derives from the fact that a generic element
$g \in G$ belongs to a unique one-parameter subgroup
$\{\exp(tX), t \in \mathbb{R}\} \subset G$, where $X$ is an element of the
Lie algebra of G. Thus, all the points on a surface in 
G may be joined by these paths to the identity, and 
the surface may then be shrunk along the resulting 
cone. There are exceptional elements for which 
this is not true, but it can be shown that in a d- 
dimensional group they lie on (d — 3)-dimensional 
surfaces, so any 2-surface can be smoothly deformed 
to avoid them.) 

It follows from this theorem that another section
of the exact sequence is

$$0 \to \pi_2(\mathcal{M}) \xrightarrow{\;\partial_*\;} \pi_1(H) \to 0$$

which again implies an isomorphism:

$$\pi_2(\mathcal{M}) \cong \pi_1(H) \qquad [11]$$

For example, if $G = SO(3)$ and $H = SO(2)$, or
equivalently $\tilde G = SU(2)$ and $\tilde H = U(1)$ (a double
cover of the $SO(2)$), then $\pi_2(\mathcal{M}) = \pi_1(\tilde H) = \mathbb{Z}$, so
point defects in this theory are labeled by an integer
winding number.
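
The step from the exact sequence to the isomorphism can be spelled out. A sketch, assuming as in the text that $G$ has been replaced by its simply connected cover, so that $\pi_1(G) = 0$, and using $\pi_2(G) = 0$; the maps $p_*$, $\partial_*$, $i_*$ are the induced maps of the exact sequence:

```latex
\underbrace{\pi_2(G)}_{=\,0} \xrightarrow{\;p_*\;} \pi_2(\mathcal{M})
  \xrightarrow{\;\partial_*\;} \pi_1(H)
  \xrightarrow{\;i_*\;} \underbrace{\pi_1(G)}_{=\,0}
\quad\Longrightarrow\quad
\ker\partial_* = \operatorname{im} p_* = 0,
\qquad
\operatorname{im}\partial_* = \ker i_* = \pi_1(H).
```

Thus $\partial_*$ is both injective and surjective, hence an isomorphism. The same argument one step lower in the sequence, with $\pi_1(G) = \pi_0(G) = 0$, gives the isomorphism between $\pi_1(\mathcal{M})$ and $\pi_0(H)$.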


Examples 


The simplest continuous symmetry is the $U(1)$ phase
symmetry $\phi \mapsto \phi e^{i\alpha}$ of a complex field. In a weakly
interacting Bose gas, below the Bose-Einstein condensation
temperature, or in superfluid helium-4,
a macroscopic fraction of the atoms occupies a
single quantum state, and $\hat\phi$ acquires a nonzero
expectation value, $\langle\hat\phi\rangle = \phi$, whose phase is arbitrary,
so the symmetry is completely broken to $H = \mathbf{1}$.
Thus, $\mathcal{M} = S^1$; we have a circle of equivalent
degenerate ground states. (This corresponds to
spontaneous breaking of the particle-number symmetry.
It is possible to describe the system in a $U(1)$-invariant
way, by projecting out a state of definite
particle number, a uniform superposition of all the
states in $\mathcal{M}$, but it is generally less convenient to do
so.) In this case, the only nontrivial homotopy group
is $\pi_1(\mathcal{M}) = \mathbb{Z}$, so the only defects are linear defects
classified by a winding number $n \in \mathbb{Z}$. The defects
with $n = \pm 1$ are stable vortices. Those with $|n| > 1$
are in general unstable and tend to break up into $|n|$
single-quantum vortices.

Low-temperature superconductors also have a 
U(1) symmetry, although there are important differ- 
ences. This is not a global symmetry but a local, 
gauge symmetry, with coupling to the electromag- 
netic field. Moreover, it is not single atoms that 
condense but Cooper pairs, pairs of electrons of 
equal and opposite momentum and spin. These 
systems too exhibit linear defects, magnetic flux 
tubes carrying a magnetic flux 4rnbh/e. 

A less trivial example is a nematic liquid crystal. 
These materials are composed of rod-shaped mole- 
cules that tend, at low temperatures, to line up 
parallel to one another. The nematic state is 
characterized by a preferred orientation, described 
by a unit vector $\mathbf{n}$, the director. (Note that $\mathbf{n}$ and $-\mathbf{n}$
are physically equivalent.) There is long-range
orientational order, with molecules preferentially
lining up parallel to $\mathbf{n}$, but unlike a solid crystal
there is no long-range translational order: the
molecules move freely past each other as in a normal
liquid.

A convenient order parameter here is the mean mass quadrupole tensor $\Phi$ of a molecule. In the nematic state, $\Phi$ is proportional to $(3\mathbf{n}\mathbf{n} - \mathbf{1})$; for example, if $\mathbf{n} = (0,0,1)$, then $\Phi$ is diagonal with diagonal elements proportional to $(-1,-1,2)$. In this case, the symmetry group is $SO(3)$ (or, more precisely, $O(3)$; but the inversion symmetry is not broken, so we can restrict our attention to the connected part of the group). The subgroup $H$ that leaves this $\Phi$ invariant is a semidirect product, $H = SO(2) \rtimes Z_2$ (isomorphic to $O(2)$), composed of rotations about the $z$-axis and rotations through $\pi$ about axes in the $x$-$y$ plane. (If we enlarge $G$ to its simply connected covering group $\tilde{G} = SU(2)$, then $H$ becomes $\tilde{H} = [U(1) \rtimes Z_4]/Z_2$, where $U(1)$ is generated as before by $J_z$. The essential difference is that the square of any of the elements in the disconnected piece of $\tilde{H}$ is not now the identity but the element $e^{2\pi i J_z} = -1 \in U(1)$.) The manifold $M$ of degenerate ground states in this case is the projective space $RP^2$ (obtained by identifying opposite points of $S^2$).
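
As a quick numerical check of the statements above, the following Python sketch builds the tensor $3\mathbf{n}\mathbf{n} - \mathbf{1}$ for $\mathbf{n} = (0,0,1)$ and tests which rotations leave it unchanged. The helper names (`quadrupole`, `rot_z`, `rot_x`, `conjugate`, `close`) are illustrative choices, not notation from the text, and the overall normalization of the tensor is set to one by assumption.

```python
import math

def quadrupole(n):
    """Return the 3x3 matrix 3 n n^T - 1 for a unit vector n."""
    return [[3 * n[i] * n[j] - (1 if i == j else 0) for j in range(3)]
            for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def conjugate(r, q):
    """Rotate a rank-2 tensor: q -> r q r^T."""
    return matmul(matmul(r, q), transpose(r))

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(3) for j in range(3))

Q = quadrupole((0.0, 0.0, 1.0))
diagonal = [Q[i][i] for i in range(3)]

# n and -n give the same tensor, so the director is headless.
same_for_minus_n = close(quadrupole((0.0, 0.0, -1.0)), Q)
# Rotations about z and rotations through pi about x lie in H.
invariant_z = close(conjugate(rot_z(0.7), Q), Q)
invariant_x_pi = close(conjugate(rot_x(math.pi), Q), Q)
# A rotation through pi/2 about x is not in H and changes the tensor.
invariant_x_half_pi = close(conjugate(rot_x(math.pi / 2), Q), Q)
```

The rotation through $\pi$ about the $x$-axis sends $\mathbf{n}$ to $-\mathbf{n}$, which is why it belongs to the unbroken subgroup even though it does not fix the vector itself.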

Since $H$ has disconnected pieces, we have $\pi_1(M) = \pi_0(\tilde{H}) = Z_2$. Thus, there can be topologically stable linear defects, here called disclination lines, around which the director $\mathbf{n}$ rotates by $\pi$ (see Figure 6). The fact that these defects are classified by $Z_2$ rather than $Z$ means that a line around which $\mathbf{n}$ rotates by $2\pi$ is topologically trivial; indeed, $\mathbf{n}$ can be smoothly rotated near the line to run parallel to it, leaving a configuration with no defect.

Figure 6 Orientation of molecules around a disclination line.

There are also point defects; since $\pi_2(M) = \pi_1(\tilde{H}) = Z$, they are labeled by an integer winding number $n$. In a defect with $n = 1$, the director $\mathbf{n}$ points radially outwards all round the defect position.


Helium-3 


Finally, let us turn to helium-3, one of the most 
fascinating and complex examples of spontaneous 
symmetry breaking, which becomes a superfluid at a 
temperature of a few millikelvin. Unlike helium-4, this 
is, of course, a Fermi liquid, so it is not the atoms that 
condense, but bound pairs of atoms, analogous to 
Cooper pairs. In this case, however, the most attractive channel is not the $^1S$ but the $^3P$, so the pairs have both orbital and spin angular momentum, $L = S = 1$. Therefore, the order parameter is not a single complex scalar field but a $3 \times 3$ complex matrix $\Phi$, where the two indices label the orbital and spin angular momentum states.

To a good approximation, the system is invariant 
under separate rotations of L and S (the effects of 
the small spin-orbit coupling will be discussed 
later), so the symmetry group is 


$G = U(1)_Y \times SO(3)_L \times SO(3)_S$ [12]


where the subscripts denote the generators and $U(1)_Y$
represents multiplication by an overall phase factor,
$e^{i\alpha Y}: \Phi \mapsto \Phi e^{i\alpha}$. This complicated symmetry allows
much scope for a large variety of defects. There are, in 
fact, two distinct superfluid phases, A and B, with 
different symmetries (and indeed in the presence of a 
magnetic field there is a third, A1). 

In the $^3$He-A phase, the order parameter has the form $\Phi_{jk} \propto (m_j + i n_j)\,d_k$, where $\mathbf{m}$, $\mathbf{n}$, $\mathbf{d}$ are unit vectors, with $\mathbf{m} \perp \mathbf{n}$; if we set $\mathbf{l} = \mathbf{m} \wedge \mathbf{n}$, then $\mathbf{l}$ defines the orbital angular momentum state by $\mathbf{l}\cdot\mathbf{L} = 1$, while $\mathbf{d}$ defines the spin quantization axis, such that $\mathbf{d}\cdot\mathbf{S} = 0$. The manifold $M_A$ for this phase is


$M_A = [SO(3) \times S^2]/Z_2$ [13]


where the $Z_2$ is present because $(\mathbf{m}, \mathbf{n}, \mathbf{d})$ and $(-\mathbf{m}, -\mathbf{n}, -\mathbf{d})$ represent the same state. If, for example, we take $\mathbf{l}$ and $\mathbf{d}$ in the $z$-direction, the unbroken symmetry subgroup is


$H_A = SO(2)_{L_z - Y} \times [SO(2)_{S_z} \rtimes Z_2]$ [14]


where the nontrivial element of $Z_2$ may be taken to be $e^{i\pi(S_x + L_z)}$. The covering group of $G$ is, of course,

$\tilde{G} = R_Y \times SU(2)_L \times SU(2)_S$ [15]

Correspondingly,

$\tilde{H}_A = R_{L_z - Y} \times [U(1)_{S_z} \rtimes Z_4]$ [16]


It follows that the homotopy groups are

$\pi_0(M_A) = 0, \quad \pi_1(M_A) = Z_4, \quad \pi_2(M_A) = Z$ [17]


There are linear defects labeled by a mod-4 quantum 
number and point defects labeled by an integer. 

For the $^3$He-B phase, by contrast, the order parameter is of the form


$\Phi_{jk} \propto R_{jk}\,e^{i\theta}$ [18]

where $R$ is a rotation matrix, $R \in SO(3)$. Here then,

$M_B = SO(3) \times S^1$ [19]

with homotopy groups


$\pi_0(M_B) = 0, \quad \pi_1(M_B) = Z_2 \times Z, \quad \pi_2(M_B) = 0$ [20]


In this phase, there are two distinct types of linear defect, the mass vortices with an integer label, and the spin vortices with a mod-2 label. (One can also have a “spin-mass vortex” carrying both quantum numbers.)


Composite Defects 


There are several cases, including in particular 
helium-3, that exhibit symmetry breaking with 
multiple length or energy scales. For example, there may be two order parameters, say $\phi$, $\psi$, with $|\phi| \gg |\psi|$. If $|\psi|$ is negligible, the symmetry $G$ is broken by $\phi$ to $H$, and the manifold of degenerate ground states is $M = G/H$. However, these states are not all exactly degenerate: $\psi$ breaks the symmetry further to $K \subset H$, so the precisely degenerate ground states form a submanifold $M' = G/K$.


The case of helium-3 is slightly different. Here it 
is the small spin-orbit coupling, arising from long- 
range dipole-dipole interactions, that introduces 
the second scale. Its effect is only significant over 
large distances. 

In the $^3$He-A phase, at short range the $\mathbf{l}$ and $\mathbf{d}$ vectors are uncorrelated but, over large distances, they tend to be aligned parallel or antiparallel. We can use the $Z_2$ symmetry mentioned earlier to choose $\mathbf{l} = \mathbf{d}$. Hence, the manifold $M'_A$ of true ground states is only a submanifold of $M_A$, namely $M'_A = SO(3)$, whose homotopy groups are


$\pi_0(M'_A) = 0, \quad \pi_1(M'_A) = Z_2, \quad \pi_2(M'_A) = 0$ [21]


Because of different behavior on different scales, 
“composite” defects can arise. For example, because 
$\pi_2(M_A) = Z$, there are short-range monopole configurations. For the $n = 1$ monopole, we have a
configuration with uniform $\mathbf{l}$, and with $\mathbf{d}$ pointing
outwards from the center. But eventually the
misalignment of $\mathbf{d}$ with $\mathbf{l}$ is energetically disfavored,
and at large distances $\mathbf{d}$ tends to rotate to align with
$\mathbf{l}$ except around one particular direction where it is
oppositely aligned (see Figure 7). We have a
composite defect: a small monopole coupled to a 
relatively fat string. 

To see how the small- and large-scale structures fit together, one has to look also at the relative homotopy groups $\pi_n(M, M')$, whose elements are homotopy classes of maps from $I^n$ to $M$ such that one face of the boundary is mapped into $M'$, and the remainder to the chosen base point $m_0$. For example, $\pi_1(M, M')$ classifies paths that terminate at $m_0$ while beginning at any point of $M'$. There is, in fact, a long exact sequence, similar to [9], relating these homotopy groups, of which a typical segment is

$\pi_n(M') \to \pi_n(M) \to \pi_n(M, M') \to \pi_{n-1}(M') \to \pi_{n-1}(M)$ [22]

Figure 7 Cross-section of a short-range monopole attached to a fat string.

The relevant groups in the present case are

$\pi_1(M_A, M'_A) = Z_2, \quad \pi_2(M_A, M'_A) = Z$ [23]


Because $\pi_1(M_A) = Z_4$, there are three distinct classes of linear defects at small scales, but only those with quantum number $n = 2$ (mod 4) survive unchanged to large scales; they correspond to the nontrivial element of $\pi_1(M'_A) = Z_2$. On the other hand, the homotopy classes $n = \pm 1$ (mod 4) are mapped to nontrivial elements of $\pi_1(M_A, M'_A) = Z_2$, which indicates that the corresponding linear defects are coupled at long range to fat domain walls, across which $\mathbf{d}$ rotates through $\pi$ with a compensating rotation through $\pi$ about $\mathbf{l}$. Similarly, the nontrivial elements of $\pi_2(M_A) = Z$ are mapped to nontrivial elements of $\pi_2(M_A, M'_A)$, confirming that these short-range monopoles are coupled to fat strings, as in Figure 7.
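
The bookkeeping for the linear defects can be illustrated with a small Python sketch. It assumes, as the text indicates, that the map $\pi_1(M'_A) = Z_2 \to \pi_1(M_A) = Z_4$ sends the nontrivial element to $2$, and it treats the relative classes simply as cosets of the image, which is a simplification (in general $\pi_1$ of a pair is only a pointed set). The names `cosets` and `relative_class` are hypothetical.

```python
def cosets(order, image):
    """Partition Z_order into cosets of the subgroup `image`."""
    seen, result = set(), []
    for x in range(order):
        if x not in seen:
            coset = frozenset((x + h) % order for h in image)
            seen |= coset
            result.append(coset)
    return result

# Image of Z_2 -> Z_4 under the assumed map 1 -> 2.
image = {(2 * k) % 4 for k in range(2)}
classes = cosets(4, image)

def relative_class(n):
    """Index of the coset containing the defect label n (mod 4)."""
    return next(i for i, c in enumerate(classes) if n % 4 in c)
```

With this convention the class with index 0 is the trivial one: the label $n = 2$ falls in it, while $n = \pm 1$ fall in the single nontrivial class, in line with the statement that those defects end up attached to domain walls.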

For $^3$He-B, the effect of the spin-orbit coupling is to make the most energetically favorable configurations those in which the rotation matrix $R$ in [18] represents a rotation about an arbitrary axis $\mathbf{n}$ through the Leggett angle $\theta_L = \arccos(-1/4) \approx 104^\circ$: $R = \exp(-i\theta_L\,\mathbf{n}\cdot\mathbf{J})$.

Consequently,

$M'_B = S^2 \times S^1$ [24]

and so

$\pi_0(M'_B) = 0, \quad \pi_1(M'_B) = Z, \quad \pi_2(M'_B) = Z$ [25]

The relative homotopy groups are

$\pi_1(M_B, M'_B) = Z_2, \quad \pi_2(M_B, M'_B) = 0$ [26]

Here the mass vortex persists at long range, but the configuration around a spin vortex deforms so that it becomes attached to a fat domain wall. The “monopole” configurations corresponding to nontrivial elements of $\pi_2(M'_B)$ have no short-range singularity at all.
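
To make the role of the Leggett angle concrete, the sketch below evaluates $\arccos(-1/4)$ and builds the corresponding rotation matrix for a few axes using the Rodrigues formula. The function name `rotation` and the sample axes are illustrative assumptions. The point of the check is that the trace $1 + 2\cos\theta$ is the same for every axis, so the favored configurations differ only by the direction of the axis (a point of $S^2$) and by the overall phase.

```python
import math

def rotation(axis, angle):
    """Rodrigues formula: rotation matrix about the unit vector `axis`."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    C = 1.0 - c
    return [[c + x * x * C, x * y * C - z * s, x * z * C + y * s],
            [y * x * C + z * s, c + y * y * C, y * z * C - x * s],
            [z * x * C - y * s, z * y * C + x * s, c + z * z * C]]

theta_L = math.acos(-0.25)          # Leggett angle in radians
theta_L_degrees = math.degrees(theta_L)

axes = [(0.0, 0.0, 1.0),
        (1.0, 0.0, 0.0),
        (1 / math.sqrt(3), 1 / math.sqrt(3), 1 / math.sqrt(3))]
traces = [sum(rotation(a, theta_L)[i][i] for i in range(3)) for a in axes]
```

Each entry of `traces` equals $1 + 2\cos\theta_L = 1/2$ up to rounding, independent of the chosen axis.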


See also: Abelian Higgs Vortices; Leray-Schauder 
Theory and Mapping Degree; Liquid Crystals; Phase 
Transition Dynamics; Quantum Field Theory: A Brief 
Introduction; Quantum Fields with Topological Defects; 
Solitons and Other Extended Field Configurations; String 
Topology: Homotopy and Geometric Perspectives; 
Symmetries and Conservation Laws; Symmetry Breaking 
in Field Theory; Variational Techniques for 
Ginzburg-Landau Energies.
Introduction


It is well known that large-N Hermitian matrix models generate Feynman diagrams which represent the triangulation of Riemann surfaces. For instance, if we consider the integral of an $N \times N$ Hermitian matrix $H$


$Z = \int dH \exp\left(-N\,\mathrm{tr}\left(\tfrac{1}{2}H^2 + \tfrac{\lambda}{4}H^4\right)\right)$ [1]


we find that the free energy $F = \log Z$ has the $1/N$ expansion


$F = \sum_{g=0}^{\infty} N^{2-2g} F_g(\lambda)$ [2]


Inspection of the Feynman diagrams shows that $F_g$ reproduces the sum over the triangulations of genus $g$ Riemann surfaces. The theory [1] is obviously well defined for $\lambda > 0$. In the large-N expansion, the theory continues to exist also at negative values of $\lambda$ down to the critical point $\lambda_c = -1/12$.

The double scaling limit of large-N matrix models (Brézin and Kazakov 1990, Douglas and Shenker 1990, Gross and Migdal 1990) is given by adjusting the coupling $\lambda$ to $\lambda_c$ and at the same time taking the limit $N \to \infty$. In this limit, contributions of all genera survive, and the theory describes the dynamics of fluctuating surfaces of arbitrary topologies. Results obtained in this way do not, in fact, depend on the detailed choice of the potential ($H^4$ type in [1]) and have a high degree of universality. Thus, the double scaling limit provides an interesting model of two-dimensional (2D) quantum gravity.

Soon after the discovery of double scaling limit of 
matrix models, Witten observed that the correlation 
functions of the 2D gravity theory may be given a 
geometrical interpretation as topological invariants 
of the moduli space of Riemann surfaces M, and 
that the 2D gravity theory may be reformulated as a 
topological field theory (Witten 1990). This refor- 
mulation of the results of the 2D gravity theory is 
called “2D topological gravity.” 

In fact, 2D gravity theories come in a family 
parametrized by a pair of integers (p, q). The double 
scaling limit of [1] gives the simplest example 
$(p = 2, q = 1)$. Models with a chain of $p - 1$ Hermitian matrices give the $(p, q)$ 2D gravity theories. The label $q$ stands for the order of criticality of the model, and higher values of $q$ are achieved by fine-tuning the parameters of the potential. At $q = 1$, 2D gravity theories possess a topological interpretation. The most basic case $(p = 2, q = 1)$ is called pure topological gravity, and in theories at higher values of $p$, topological gravity is coupled to a matter system, that is, topological minimal models. Topological minimal models are obtained by twisting $N = 2$ superconformal field theories.

Let us first consider the case of pure gravity, $(p, q) = (2, 1)$. Let $\mathcal{O}_n$ denote the observables in the theory and $t_n$ the coupling constants to these operators. The correlation functions of topological gravity are given by


$\langle \mathcal{O}_{n_1}\mathcal{O}_{n_2}\cdots\mathcal{O}_{n_s}\rangle_g$ [3]


where $\langle\cdots\rangle_g$ denotes the expectation value on a surface with $g$ handles. The precise significance of eqn [3] as the intersection number on the moduli space is discussed below. The string partition function $\tau(t)$ is defined as the generating function of all possible correlation functions


$\tau(t) = \exp\sum_{g=0}^{\infty}\left\langle \exp\left(\sum_n t_n\mathcal{O}_n\right)\right\rangle_g$ [4]


The most striking aspect of topological gravity is the connection of the intersection theory on $\mathcal{M}$ to the theory of completely integrable systems, that is, Korteweg-de Vries (KdV) and KP hierarchies. Witten conjectured that the generating function of intersection numbers on moduli space, $\tau(t)$, is the $\tau$-function of the KdV hierarchy. The KdV hierarchy is obtained by generalizing the well-known KdV equation


$\frac{\partial u}{\partial t} = \frac{3}{2}u\frac{\partial u}{\partial x} + \frac{1}{4}\frac{\partial^3 u}{\partial x^3}$ [5]

Identification of the KdV equation with topological gravity is given by $u = 2\langle\mathcal{O}_1\mathcal{O}_1\rangle$, $x = t_1$, $t = t_3$. Witten’s conjecture was verified by Kontsevich (1991) by an explicit construction of a new type of matrix model which generates the triangulation of the moduli space of Riemann surfaces.

In the general case of $(p, 1)$ topological gravity, the partition function of the theory obeys the equations of the $p$th generalized KdV hierarchy ($p$-reduction of the KP hierarchy).


Intersection Theory 


We now present some basic features of intersection theory on the moduli space of Riemann surfaces. It is known that 2D oriented surfaces $X$ with $g$ handles and $s$ marked points $x_i$ ($i = 1,\ldots,s$) possess a finite-dimensional family of inequivalent complex structures (complex structures are identified when they differ only by diffeomorphism). The space of inequivalent complex structures is called the moduli space $\mathcal{M}_{g,s}$ of the Riemann surface $X$. Its dimension is given by


$\dim\mathcal{M}_{g,s} = 3g - 3 + s$ [6]


For a mathematically rigorous treatment, we have to consider a compactification $\overline{\mathcal{M}}_{g,s}$ of the moduli space $\mathcal{M}_{g,s}$ by adding suitable boundary components which arise due to various types of degenerations of Riemann surfaces. In the Deligne-Mumford or stable compactification, one considers the following three classes of singular Riemann surfaces $X$:


1. Two points, $x_i$ and $x_j$, on $X$ come close together. In this case, an extra 2-sphere is pinched off from the surface by forming a thin neck. The sphere contains points $x_i$ and $x_j$ and also a third point at the end of the neck (see Figure 1a). Since the original surface now has one point less and the 2-sphere with three points has no moduli, the degenerate surface has $3g - 4 + s$ parameters and forms a boundary divisor of the moduli space $\overline{\mathcal{M}}_{g,s}$.

2. If a cycle of nontrivial homology class shrinks to a point, we have a surface with one less genus and two extra marked points. The singular surface has $3(g - 1) - 3 + s + 2$ moduli and this is again a complex codimension-1 component (see Figure 1b).

3. Similarly, if a dividing cycle pinches, one obtains two disconnected surfaces of genus $g_i$ with $s_i + 1$ marked points ($i = 1, 2$; $g_1 + g_2 = g$, $s_1 + s_2 = s$). This type of degeneration also has the same number of parameters $\sum_i (3g_i - 3) + \sum_i (s_i + 1) = 3g - 4 + s$ (see Figure 1c).

Figure 1 Degenerate Riemann surface obtained when (a) the points $x_i$ and $x_j$ coincide; (b) a nontrivial cycle collapses and two new points are created; (c) a pinching cycle collapses and two new points are created.
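
The parameter counts in the three cases above can be checked mechanically. The following Python sketch uses only the dimension formula [6]; the function names are hypothetical, and stability conditions on the components (which restrict the allowed values of $g_i$ and $s_i$) are deliberately ignored.

```python
def dim_moduli(g, s):
    """Complex dimension 3g - 3 + s of the moduli space, eqn [6]."""
    return 3 * g - 3 + s

def colliding_points(g, s):
    # The surface keeps genus g but carries s - 1 marked points
    # (two points are replaced by the neck point); the pinched-off
    # sphere with three points contributes dim_moduli(0, 3) = 0.
    return dim_moduli(g, s - 1) + dim_moduli(0, 3)

def nonseparating_pinch(g, s):
    # Genus drops by one and two new marked points appear.
    return dim_moduli(g - 1, s + 2)

def separating_pinch(g1, s1, g2, s2):
    # Two components, each with one extra marked point at the node.
    return dim_moduli(g1, s1 + 1) + dim_moduli(g2, s2 + 1)

g, s = 3, 4
boundary_dims = (colliding_points(g, s),
                 nonseparating_pinch(g, s),
                 separating_pinch(1, 1, 2, 3))
```

For $g = 3$, $s = 4$ each entry of `boundary_dims` equals $3g - 4 + s = 9$, one less than $\dim\mathcal{M}_{g,s} = 10$, as expected for a divisor.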


It is known that $\overline{\mathcal{M}}_{g,s}$ is a compact and smooth orbifold space, and observables of topological gravity are given by the cohomology classes on $\overline{\mathcal{M}}_{g,s}$. There exist special cohomology classes introduced by Mumford and Morita, which are defined as follows. There are natural line bundles $\mathcal{L}_1,\ldots,\mathcal{L}_s$ on the moduli space $\overline{\mathcal{M}}_{g,s}$. The fiber of the bundle $\mathcal{L}_i$ at a point $X \in \overline{\mathcal{M}}_{g,s}$ is the cotangent space $T^*_{x_i}X$ to the point $x_i$ on the surface $X$. These line bundles have the first Chern classes $c_1(\mathcal{L}_i)$ and by taking their exterior powers we can define $2n$-dimensional classes


$\sigma_n(i) = c_1(\mathcal{L}_i)^n \in H^{2n}(\overline{\mathcal{M}}_{g,s})$ [7]


Correlation functions are defined by integrating these classes over the moduli space:


$\langle\sigma_{n_1}\cdots\sigma_{n_s}\rangle_g = \int_{\overline{\mathcal{M}}_{g,s}} c_1(\mathcal{L}_1)^{n_1}\wedge\cdots\wedge c_1(\mathcal{L}_s)^{n_s}$ [8]


These integrals are topological invariants of $\overline{\mathcal{M}}_{g,s}$ and are nonzero only when the degree of the cohomology classes adds up to the dimension of the moduli space


$\sum_{i=1}^{s} n_i = 3g - 3 + s$ [9]


$\sigma_n(i)$ is known as the $n$th descendant of the puncture operator $\sigma_0(i)$, since it is associated with the marked point $x_i$.

The above correlation functions are evaluated using various recursion relations. First, one has the puncture equation


$\langle\sigma_0\sigma_{n_1}\cdots\sigma_{n_s}\rangle_g = \sum_{i=1,\,n_i\neq 0}^{s}\langle\sigma_{n_1}\cdots\sigma_{n_i-1}\cdots\sigma_{n_s}\rangle_g$ [10]


which can be derived by considering a map $\pi: \overline{\mathcal{M}}_{g,s+1} \to \overline{\mathcal{M}}_{g,s}$ where one forgets the position of an extra point. Contributions arise when the forgotten point coincides with the other points. This relation can be used to eliminate $\sigma_0$'s from correlation functions when they are well defined. At $g = 0$, correlation functions with fewer than three insertions are ill-defined and one has


$\langle\sigma_0\sigma_0\sigma_0\rangle_0 = 1$ [11]


Another basic relation is the dilaton equation for the operator $\sigma_1$:


$\langle\sigma_1\sigma_{n_1}\cdots\sigma_{n_s}\rangle_g = (2g - 2 + s)\langle\sigma_{n_1}\cdots\sigma_{n_s}\rangle_g$ [12]


The dilaton equation follows from the fact that since $\sigma_1$ is the first Chern class $c_1(\mathcal{L})$, it calculates the degree of the canonical line bundle of a genus $g$ surface with $s$ punctures. At $g = 1$, one insertion is required and one has


$\langle\sigma_1\rangle_1 = \frac{1}{24}$ [13]


By combining these recursion relations, one can evaluate the correlation functions. For instance, at $g = 0$ one finds


$\langle\sigma_{n_1}\cdots\sigma_{n_s}\rangle_0 = \frac{(n_1 + \cdots + n_s)!}{n_1!\cdots n_s!}$ [14]
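
The genus-zero statement can be tested directly. The sketch below, with hypothetical function names `genus0` and `closed_form`, evaluates genus-zero correlation functions using only the puncture equation [10] and the normalization [11], and compares the result with the multinomial expression [14]. It relies on the selection rule [9] at $g = 0$, $\sum n_i = s - 3$, which guarantees that at least three insertions are $\sigma_0$, so the puncture equation can always be applied until only three insertions remain.

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def genus0(ns):
    """<sigma_{n_1} ... sigma_{n_s}>_0 for a tuple ns of non-negative ints."""
    ns = tuple(sorted(ns))
    s = len(ns)
    if s < 3 or sum(ns) != s - 3:      # selection rule [9] at g = 0
        return 0
    if s == 3:                         # only (0, 0, 0) reaches this point
        return 1                       # normalization [11]
    rest = ns[1:]                      # ns[0] == 0: remove one sigma_0
    total = 0
    for i, n in enumerate(rest):
        if n > 0:                      # puncture equation [10]
            lowered = rest[:i] + (n - 1,) + rest[i + 1:]
            total += genus0(tuple(sorted(lowered)))
    return total

def closed_form(ns):
    """Right-hand side of [14], zero when the selection rule fails."""
    if len(ns) < 3 or sum(ns) != len(ns) - 3:
        return 0
    denominator = 1
    for n in ns:
        denominator *= factorial(n)
    return factorial(sum(ns)) // denominator

samples = [(0, 0, 0), (0, 0, 0, 1), (0, 0, 0, 1, 1),
           (0, 0, 0, 0, 1, 2), (0, 0, 0, 0, 0, 2, 2)]
agreement = all(genus0(ns) == closed_form(ns) for ns in samples)
```

The same function can be used to spot-check the dilaton equation [12], for example `genus0((1, 0, 0, 0))` against $(2g - 2 + s)\,\langle\sigma_0\sigma_0\sigma_0\rangle_0$ with $g = 0$, $s = 3$.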
A powerful way of computing correlation functions is given by the KdV hierarchies and Virasoro conditions as discussed below. In the context of integrable systems, it is convenient to redefine the observables as


$\mathcal{O}_{2n+1} = (2n + 1)!!\,\sigma_n, \quad n \geq 0$ [15]


Topological Minimal Models 


Standard intersection theory applies to the case of pure topological gravity, $p = 2$. At higher values of $p$, the theory is generalized as follows: one introduces the coupling of topological gravity to the topological matter sector which is obtained by twisting the $N = 2$ superconformal theories.

We recall that $N = 2$ superconformal symmetry is generated by the stress tensor $T(z)$, the $U(1)$ current $J(z)$, and two types of supersymmetry generators $G^{\pm}(z)$. (In the holomorphic sector of the theory these operators depend on the holomorphic coordinate $z$ of the Riemann surface. In the antiholomorphic sector they depend on the antiholomorphic variable $\bar{z}$.) Mode expansion of the stress tensor and $U(1)$ current is given by


$T(z) = \sum_n L_n z^{-n-2}, \quad J(z) = \sum_n J_n z^{-n-1}$ [16]


$L_n$ generates the Virasoro algebra


$[L_m, L_n] = (m - n)L_{m+n} + \frac{c}{12}m(m^2 - 1)\delta_{m+n,0}$ [17]


where $c$ denotes the central charge of the theory. Commutators of $J_m$ and $L_n$ are given by

$[J_m, J_n] = \frac{c}{3}m\,\delta_{m+n,0}, \quad [L_m, J_n] = -nJ_{m+n}$ [18]

It is known that there is a continuum of unitary $N = 2$ conformal theories in the range $c \geq 3$; however, only discrete values of the central charge $c = 3k/(k + 2)$, $k = 1, 2, \ldots$ are allowed in the region $3 > c \geq 1$. These are the $N = 2$ minimal models labeled by the level $k$. Only a finite number of primary fields exist in these theories.

In $N = 2$ theory, primary fields $\phi_a$ are characterized by their conformal dimension and $U(1)$ charge:


$L_0|\phi_a\rangle = h_a|\phi_a\rangle, \quad J_0|\phi_a\rangle = q_a|\phi_a\rangle$ [19]


There exists a special set of primary operators, chiral primary fields $\phi_\ell$ ($\ell = 0,\ldots,k$), which are annihilated by the supercharge operator $G^+$:


$\oint dz\,G^+(z)|\phi_\ell\rangle = 0$ [20]


$\phi_\ell$ has the dimension and $U(1)$ charge


$q(\phi_\ell) = \frac{\ell}{k + 2}, \quad h(\phi_\ell) = \frac{1}{2}q(\phi_\ell), \quad \ell = 0, 1, \ldots, k$ [21]


By considering primary fields annihilated by $G^-$, we can also define antichiral fields. Antichiral fields have $U(1)$ charge opposite to those of chiral fields.
If one defines the twisted stress tensor by

$T'(z) = T(z) + \frac{1}{2}\partial J(z)$ [22]


then $T'(z)$ has a vanishing central charge. Furthermore, the conformal dimensions of the supersymmetry operators $G^{\pm}$ become shifted from 3/2 to $h(G^+) = 1$ and $h(G^-) = 2$. It is then possible to integrate $G^+$ on the Riemann surface and define a fermionic scalar operator $G_0^+ = \oint dz\,G^+(z)$. From the $N = 2$ algebra, one has

$(G_0^+)^2 = 0, \quad \{G_0^+, G^-(z)\} = 2T'(z)$ [23]

If we identify $G_0^+$ as the Becchi-Rouet-Stora-Tyutin (BRST) operator of the theory, then the twisted stress tensor becomes BRST trivial, which is the characteristic feature of topological field theory. Thus, we obtain a topological field theory by twisting $N = 2$ conformal theory (Eguchi and Yang 1990). These are topological minimal models. BRST-invariant observables are given by the chiral primary fields [20]. (To be precise, when we take account of the antiholomorphic sector, we may define either $Q = G_0^+ + \bar{G}_0^+$ or $Q = G_0^+ + \bar{G}_0^-$ as the BRST operator. Thus, in general, we obtain two different topological field theories. This is the origin of the mirror symmetry. In the context of topological gravity, one takes the convention $Q = G_0^+ + \bar{G}_0^+$.)
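
The effect of the twist on conformal dimensions can be illustrated with exact rational arithmetic. The sketch below assumes the standard convention that $G^{\pm}$ carry $U(1)$ charge $\pm 1$ (not stated explicitly in the text) and uses the relation $L'_0 = L_0 - \frac{1}{2}J_0$, which follows from the twisted stress tensor $T' = T + \frac{1}{2}\partial J$ together with the mode expansions [16]. The function names are hypothetical.

```python
from fractions import Fraction

def central_charge(k):
    """Central charge c = 3k/(k+2) of the level-k minimal model."""
    return Fraction(3 * k, k + 2)

def chiral_primary(l, k):
    """(h, q) of the chiral primary field phi_l, eqn [21]."""
    q = Fraction(l, k + 2)
    return q / 2, q

def twisted_dimension(h, q):
    """Dimension with respect to T' = T + (1/2) dJ: h' = h - q/2."""
    return h - q / 2

k = 3
charges = [central_charge(j) for j in range(1, 6)]
g_plus = twisted_dimension(Fraction(3, 2), 1)     # assumed charge +1
g_minus = twisted_dimension(Fraction(3, 2), -1)   # assumed charge -1
primaries = [twisted_dimension(*chiral_primary(l, k)) for l in range(k + 1)]
```

Under these assumptions `g_plus` and `g_minus` come out as 1 and 2, matching the shifted dimensions quoted above, and every chiral primary acquires twisted dimension zero, as one expects for observables of a topological theory.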

Now, we consider the coupling of topological gravity to topological minimal models. We identify $k = p - 2$. Making use of chiral fields $\phi_\ell$ ($\ell = 0,\ldots,p - 2$), observables are constructed:


$\sigma_{n,\ell} = \sigma_n \otimes \phi_\ell$ [24]


The $N = 2$ $U(1)$ charge is identified as the degree of differential form of the moduli space. Thus, the degree of $\sigma_{n,\ell}$ is $n + \ell/p$. Correlation functions $\langle\prod_{i=1}^{s}\sigma_{n_i,\ell_i}\rangle_g$ are nonzero if the selection rule


$\sum_{i=1}^{s}\left(n_i + \frac{\ell_i}{p}\right) = (g - 1)\left(3 - \frac{p - 2}{p}\right) + s$ [25]


is obeyed.
We may assemble $\sigma_{n,k}$ into operators with one index, $\mathcal{O}_m$, as


$\mathcal{O}_{np+k+1} = \prod_{r=0}^{n}(rp + k + 1)\cdot\sigma_{n,k}$ [26]


where one introduces a convenient normalization factor. Note that the operators $\mathcal{O}_m$ do not exist when $m = 0$ mod $p$ and the corresponding parameters $t_m$ are absent. This is a characteristic feature of the $p$-reduced KP hierarchy.
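The puncture and dilaton equations for $(p, 1)$ theories read


$\langle\sigma_{0,0}\sigma_{n_1,k_1}\cdots\sigma_{n_s,k_s}\rangle_g = \sum_{i=1,\,n_i\neq 0}^{s}\langle\sigma_{n_1,k_1}\cdots\sigma_{n_i-1,k_i}\cdots\sigma_{n_s,k_s}\rangle_g$ [27]


$\langle\sigma_{1,0}\sigma_{n_1,k_1}\cdots\sigma_{n_s,k_s}\rangle_g = (2g - 2 + s)\langle\sigma_{n_1,k_1}\cdots\sigma_{n_s,k_s}\rangle_g$ [28]

The special terms at $g = 0$ and $g = 1$ are given by


$\langle\sigma_{0,0}\sigma_{0,i}\sigma_{0,p-i-2}\rangle_0 = 1, \quad \langle\sigma_{1,0}\rangle_1 = \frac{p - 1}{24}$ [29]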
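
As a consistency check, the special correlation functions at $g = 0$ and $g = 1$ should be compatible with the selection rule [25]. The Python sketch below tests this with exact fractions; the function name `selection_rule_holds` and the representation of an insertion $\sigma_{n,\ell}$ as a pair `(n, l)` are illustrative assumptions.

```python
from fractions import Fraction

def selection_rule_holds(p, g, insertions):
    """Check eqn [25] for a list of (n, l) pairs labelling sigma_{n,l}."""
    lhs = sum(Fraction(n) + Fraction(l, p) for n, l in insertions)
    rhs = (g - 1) * (3 - Fraction(p - 2, p)) + len(insertions)
    return lhs == rhs

p = 5
# Genus-zero three-point functions <sigma_{0,0} sigma_{0,i} sigma_{0,p-i-2}>_0.
three_point_ok = all(
    selection_rule_holds(p, 0, [(0, 0), (0, i), (0, p - i - 2)])
    for i in range(p - 1)
)
# Genus-one one-point function <sigma_{1,0}>_1.
one_point_ok = selection_rule_holds(p, 1, [(1, 0)])
# A combination that violates the rule for p > 2.
forbidden = selection_rule_holds(p, 0, [(0, 0), (0, 0), (0, 0)])
```

For $p = 2$ the matter labels are all zero and the rule reduces to the pure-gravity condition [9].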


Integrable Hierarchy 


We now summarize some basic facts about the integrable hierarchy (see for instance eqn [5]). We introduce a $p$th order differential operator:


$L = D^p + \sum_{i=0}^{p-2}u_i(x)D^i, \quad D = \frac{\partial}{\partial x}$ [30]


where the coefficient functions $u_i$ are arbitrary functions of $x$. This Lax operator describes the $p$th generalized KdV hierarchy. We consider the time evolution of the operator $L$ by an infinite set of commuting Hamiltonians:


$\frac{\partial L}{\partial t_n} = [H_n, L], \quad n = 1, 2, \ldots$ [31]

where $H_n$ is given by

$H_n = (L^{n/p})_+$ [32]

Here “+” denotes the non-negative part of a pseudodifferential operator and is defined as


$A = \sum_{i=-\infty}^{n} f_i(x)D^i, \quad A_+ = \sum_{i=0}^{n} f_i(x)D^i$ [33]

We also use the notation

$\mathrm{res}\,A = f_{-1}(x), \quad A_- = \sum_{i=-\infty}^{-1} f_i(x)D^i$ [34]


Note that $x$ is identified as the first time variable $t_1$, that is, $x = t_1$.


It is a basic result of the calculus of pseudodifferential operators that the above Hamiltonians satisfy the zero-curvature condition


$\frac{\partial H_m}{\partial t_n} - \frac{\partial H_n}{\partial t_m} + [H_m, H_n] = 0$ [35]


Note that when $m$ is a multiple of $p$, $H_m$ becomes a power of $L$ and trivially commutes with $L$. Thus, the time variables $t_n$ are absent for $n = 0$ mod $p$. In the simple case of $p = 2$, one has


$L = D^2 + u(x)$ [36]

and $H_3 = D^3 + (3/2)uD + (3/4)u'$. One finds


$\frac{\partial L}{\partial t_3} = \frac{\partial u}{\partial t_3} = \frac{3}{2}u\frac{\partial u}{\partial x} + \frac{1}{4}\frac{\partial^3 u}{\partial x^3}$ [37]


which is the standard KdV equation.
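
The commutator behind the KdV equation can be verified without a computer algebra system by letting the operators act on polynomials. The sketch below is an illustration under stated assumptions: $u$ and the test function $f$ are arbitrary polynomials in $x$ stored as coefficient lists, and all helper names (`padd`, `pmul`, `pdiff`, `lax`, `h3`, and so on) are hypothetical. It checks that $[H_3, L]$ acts as multiplication by $\frac{3}{2}uu' + \frac{1}{4}u'''$ for $L = D^2 + u$ and $H_3 = D^3 + \frac{3}{2}uD + \frac{3}{4}u'$.

```python
from fractions import Fraction

def padd(a, b):
    """Add two polynomials given as coefficient lists [a0, a1, ...]."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]

def pscale(c, a):
    return [c * x for x in a]

def pmul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def pdiff(a, times=1):
    """Differentiate a polynomial `times` times with respect to x."""
    for _ in range(times):
        a = [i * a[i] for i in range(1, len(a))] or [Fraction(0)]
    return a

def trim(a):
    """Drop trailing zero coefficients so polynomials compare cleanly."""
    a = list(a)
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a

def lax(u, f):
    """L f = f'' + u f."""
    return padd(pdiff(f, 2), pmul(u, f))

def h3(u, f):
    """H_3 f = f''' + (3/2) u f' + (3/4) u' f."""
    return padd(padd(pdiff(f, 3),
                     pscale(Fraction(3, 2), pmul(u, pdiff(f)))),
                pscale(Fraction(3, 4), pmul(pdiff(u), f)))

def commutator(u, f):
    """[H_3, L] applied to f."""
    return padd(h3(u, lax(u, f)), pscale(-1, lax(u, h3(u, f))))

def kdv_rhs(u):
    """(3/2) u u' + (1/4) u'''."""
    return padd(pscale(Fraction(3, 2), pmul(u, pdiff(u))),
                pscale(Fraction(1, 4), pdiff(u, 3)))

u = [1, 2, 0, 1]       # u(x) = 1 + 2x + x^3
f = [0, 1, 0, 0, 2]    # f(x) = x + 2x^4
matches = trim(commutator(u, f)) == trim(pmul(kdv_rhs(u), f))
```

Because all derivative terms cancel in the commutator, the result is a pure multiplication operator, which is why it can be identified with $\partial u/\partial t_3$.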
In the case of the KP hierarchy, one starts with a pseudodifferential operator


$Q = D + \sum_{i=1}^{\infty}a_i D^{-i}$ [38]

and considers the time evolution equations


$\frac{\partial Q}{\partial t_n} = [H_n, Q], \quad H_n = (Q^n)_+$ [39]

The $p$-reduced KP hierarchy is obtained if one has

$(Q^p)_- = 0$ [40]


By introducing a pseudodifferential operator $K$, one may bring $Q$ to the simple derivative operator $D$ as


$Q = KDK^{-1}$ [41]


$K$ has an expansion of the form

$K = 1 + \sum_{i=1}^{\infty}b_i D^{-i}$ [42]


After time evolution, the coefficient functions $u_i(x)$ of the Lax operator depend also on the variables $t_2, t_3, \ldots$ and become functions of $t = \{t_1, t_2, \ldots\}$. These functions are expressed by the $\tau$-function $\tau(t)$ of the hierarchy in the following manner:


$\mathrm{res}\,K = -\frac{\partial}{\partial x}\log\tau(t)$ [43]


$\mathrm{res}\,L^{i/p} = \frac{\partial^2}{\partial t_1\,\partial t_i}\log\tau(t)$ [44]


These residues are expressed in terms of $\{u_i\}$ and their derivatives in $x$, and one can determine them in terms of the $\tau$-function.


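In the case $p = 2$, one has

$[H_k, L] = 2D\,\mathrm{res}(L^{k/2}) = DR_k, \quad k\ \text{odd}$ [45]

Here $\{R_k\}$ are the Gelfand-Dikii potentials

$R_1 = u, \quad R_3 = \frac{1}{4}(3u^2 + u''), \quad R_5 = \frac{1}{16}(10u^3 + 5u'^2 + 10uu'' + u'''')$ [46]

and obey the recursion relation

$DR_{k+2} = \left(\frac{1}{4}D^3 + \frac{1}{2}(Du + uD)\right)R_k$ [47]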
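
The recursion relation can be checked against the listed potentials. The sketch below assumes the coefficients $R_1 = u$, $R_3 = \frac{1}{4}(3u^2 + u'')$, $R_5 = \frac{1}{16}(10u^3 + 5u'^2 + 10uu'' + u'''')$ and the recursion operator $\frac{1}{4}D^3 + \frac{1}{2}(Du + uD)$, and it takes $u$ to be an arbitrary polynomial in $x$ stored as a coefficient list. Helper names are hypothetical.

```python
from fractions import Fraction

def padd(a, b):
    """Add two polynomials given as coefficient lists [a0, a1, ...]."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]

def pscale(c, a):
    return [c * x for x in a]

def pmul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def pdiff(a, times=1):
    """Differentiate a polynomial `times` times with respect to x."""
    for _ in range(times):
        a = [i * a[i] for i in range(1, len(a))] or [Fraction(0)]
    return a

def trim(a):
    a = list(a)
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a

def gelfand_dikii(u):
    """Return (R_1, R_3, R_5) for a polynomial u."""
    u1, u2, u4 = pdiff(u), pdiff(u, 2), pdiff(u, 4)
    r1 = list(u)
    r3 = pscale(Fraction(1, 4), padd(pscale(3, pmul(u, u)), u2))
    r5 = pscale(Fraction(1, 16),
                padd(padd(pscale(10, pmul(u, pmul(u, u))),
                          pscale(5, pmul(u1, u1))),
                     padd(pscale(10, pmul(u, u2)), u4)))
    return r1, r3, r5

def recursion_rhs(u, r):
    """(1/4) D^3 r + (1/2) (D(u r) + u D r)."""
    return padd(pscale(Fraction(1, 4), pdiff(r, 3)),
                pscale(Fraction(1, 2),
                       padd(pdiff(pmul(u, r)), pmul(u, pdiff(r)))))

u = [0, 1, -2, 0, 3]   # u(x) = x - 2x^2 + 3x^4
r1, r3, r5 = gelfand_dikii(u)
check_3 = trim(pdiff(r3)) == trim(recursion_rhs(u, r1))
check_5 = trim(pdiff(r5)) == trim(recursion_rhs(u, r3))
```

The first step of the recursion, from $R_1$ to $R_3$, reproduces the right-hand side of the KdV equation, $\frac{3}{2}uu' + \frac{1}{4}u'''$.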


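If one uses the relation [44], Gelfand-Dikii potentials are identified as

$R_k = 2\langle\mathcal{O}_1\mathcal{O}_k\rangle$ [48]


By setting $k = 1$, we note $u = 2\langle\mathcal{O}_1\mathcal{O}_1\rangle$ and find that the evolution equations [31] are all satisfied as


$\frac{\partial L}{\partial t_k} = \frac{\partial u}{\partial t_k} = \frac{\partial}{\partial t_k}2\langle\mathcal{O}_1\mathcal{O}_1\rangle = 2D\langle\mathcal{O}_1\mathcal{O}_k\rangle = DR_k = [H_k, L]$ [49]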


Now it is possible to identify the initial condition for the Lax operator in the case of topological $(p, 1)$ gravity. By using the definition


$\log\tau(t) = \sum_{g=0}^{\infty}\left\langle\exp\left(\sum_n t_n\mathcal{O}_n\right)\right\rangle_g$ [50]

one has

$\mathrm{res}\,L^{i/p}(0) = \langle\mathcal{O}_1\mathcal{O}_i\rangle, \quad i = 1,\ldots,p - 1$ [51]

From [29] one finds

$\mathrm{res}\,L^{i/p}(0) = ix\,\delta_{i,p-1}$ [52]


This gives the initial value of the Lax operator:

$L(0) = D^p + px$ [53]


Thus, only the lowest term $u_0(x) = px$ is nonzero and higher coefficients all vanish at $t = 0$. This is the special simplification which takes place in the topological gravity theory.

We note a relation


$\left[\frac{1}{p}D, L(0)\right] = 1$ [54]


This is the so-called “string equation” (at $t = 0$). At nonzero values of $t$, the string equation takes the form


$[P, L] = 1, \quad P = \frac{1}{p}(L^{1/p})_+ - \frac{1}{p}\sum_{k=p+1}k\,t_k\,(L^{(k-p)/p})_+$ [55]


From [55], we see that $(p, 1)$ theory corresponds to the background value of the coupling $t_{p+1} = -1/(p + 1)$. In the case of $(p, q)$ theory, the background value is given by $t_{pq+1} = -1/(pq + 1)$.


Virasoro Conditions 


A powerful algebraic machinery controlling the structure of 2D gravity is the so-called “Virasoro conditions.” One introduces differential operators


$L_{-1} = -\frac{\partial}{\partial t_1} + \sum_{k=p+1}^{\infty}k\,t_k\frac{\partial}{\partial t_{k-p}} + \frac{1}{2}\sum_{i+j=p}ij\,t_i t_j$ [56]


$L_0 = -\frac{\partial}{\partial t_{p+1}} + \sum_{k=1}^{\infty}k\,t_k\frac{\partial}{\partial t_k} + \frac{p^2 - 1}{24}$ [57]


By using the fact that a derivative in $t_n$ brings down the operator $\mathcal{O}_n$ when acting on the $\tau$-function, it is easy to show that


$L_{-1}\cdot\tau = 0$ [58]


$L_0\cdot\tau = 0$ [59]


reproduce the puncture [27] and dilaton [28] equations, respectively. It is possible to show that the $L_{-1}$-condition, $L_{-1}\cdot\tau = 0$, is equivalent to the string equation [55].

Together with the operators ($n \geq 1$)


$L_n = -\frac{\partial}{\partial t_{(n+1)p+1}} + \sum_{k=1}^{\infty}k\,t_k\frac{\partial}{\partial t_{k+np}} + \frac{1}{2}\sum_{i+j=np}\frac{\partial^2}{\partial t_i\,\partial t_j}$

they generate the Virasoro algebra ($L'_n = (1/p)L_n$)

$[L'_m, L'_n] = (m - n)L'_{m+n}, \quad m, n \geq -1$ [60]


It is possible to show that the $(p, 1)$ model obeys the Virasoro conditions

$L_n\cdot\tau = 0, \quad n \geq -1$ [61]


It is known that $(p, 1)$ models with $p > 2$ also obey constraints of W-algebra.

The relationship of the Virasoro conditions to KdV hierarchy is summarized as


string equation + KdV hierarchy
$\Longleftrightarrow$ Virasoro and W-algebra constraints

Topological σ-Model


It is known that when the target space of a supersymmetric nonlinear σ-model is a Kähler manifold $K$, the theory acquires an enhanced $N = 2$ supersymmetry. Then we can twist the theory and convert it into a topological field theory. This is the topological σ-model [7]. The partition function of the theory consists of a sum over world-sheet instantons, that is, holomorphic maps from the Riemann surface to the target space $K$. Due to supersymmetry, functional determinants around instantons cancel and the theory simply counts the number of holomorphic curves inside the Kähler manifold $K$. Thus, the topological σ-model has a close relationship with enumerative problems in algebraic geometry, that is, Gromov-Witten invariants and quantum cohomology theory.

When the topological σ-model is coupled to topological gravity, the BRST-invariant observables are given by $\sigma_n(\Phi_j) = \sigma_n\otimes\Phi_j$, where $\Phi_j$ are cohomology classes of $K$. Correlation functions are defined as


$\left\langle\prod_{j=1}^{s}\sigma_{n_j}(\Phi_j)\right\rangle_{g,d} = \int_{\overline{\mathcal{M}}_{g,s}(K;d)}\prod_{j=1}^{s}c_1(\mathcal{L}_j)^{n_j}\wedge e_j^*(\Phi_j)$ [62]


Here $\overline{\mathcal{M}}_{g,s}(K;d)$ denotes the (stable compactification of the) moduli space of degree $d$ holomorphic maps to $K$ from genus $g$ Riemann surfaces $X$. $e_j^*$ is the pullback by the evaluation map $e_j: (f; x_1,\ldots,x_s) \in \overline{\mathcal{M}}_{g,s}(K;d) \mapsto f(x_j) \in K$, where $f$ is a holomorphic map. Correlation functions [62] give topological (symplectic) invariants of the manifold $K$. In the cases $n_j = 0$ ($j = 1,\ldots,s$), they are known as Gromov-Witten invariants.

Equation [62] is nonvanishing if the selection rule


$\sum_{j=1}^{s}(n_j + q_j) = \dim\overline{\mathcal{M}}_{g,s}(K;d) = c_1(K)\,d + (3 - \dim K)(g - 1) + s$ [63]


is obeyed, where $q_j$ is the degree of cohomology class $\Phi_j$ and $c_1(K)$ is the first Chern class of the tangent bundle of $K$.

We see that there is a close parallel between the topological σ-model and $(p, 1)$ topological gravity. If we formally set $q_j = \ell_j/p$, $c_1(K) = 0$, and $\dim K = (p - 2)/p$, eqn [63] agrees with eqn [25]. Based on this analogy, Eguchi, Hori, and Xiong proposed the Virasoro conjecture [8], that is, generating functions of the number of holomorphic maps to arbitrary Kähler manifolds are annihilated by the Virasoro operators which are constructed by taking an analogy with those of $(p, 1)$ gravity. The Virasoro conjecture is a natural generalization of Witten's conjecture, and has recently been rigorously proved in the case of curves and projective spaces.
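
The formal substitution described above is easy to verify with exact fractions. In the sketch below, the two functions encode the right-hand sides of the σ-model selection rule [63] and of the $(p, 1)$ gravity selection rule [25]; the function names and the convention of passing the product $c_1(K)\,d$ as a single number are illustrative assumptions.

```python
from fractions import Fraction

def sigma_model_dimension(c1_times_d, dim_k, g, s):
    """Right-hand side of [63]: c_1(K) d + (3 - dim K)(g - 1) + s."""
    return c1_times_d + (3 - dim_k) * (g - 1) + s

def gravity_dimension(p, g, s):
    """Right-hand side of [25]: (g - 1)(3 - (p - 2)/p) + s."""
    return (g - 1) * (3 - Fraction(p - 2, p)) + s

# Formal identification: c_1(K) = 0 and dim K = (p - 2)/p.
agree = all(
    sigma_model_dimension(0, Fraction(p - 2, p), g, s) == gravity_dimension(p, g, s)
    for p in range(2, 8) for g in range(0, 4) for s in range(0, 6)
)

# For p = 2 the "target" has dimension zero and one recovers 3g - 3 + s.
pure_gravity = sigma_model_dimension(0, 0, 2, 1)
```

The fractional value $\dim K = (p - 2)/p$ has no geometric meaning as a manifold dimension; it is only the formal parameter for which the two counting rules coincide.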

Excellent reviews on the theory of 2D topological 
gravity are given in Witten (1991) and Dijkgraaf (1991). 


See also: Axiomatic Approach to Topological Quantum 
Field Theory; Large-N and Topological Strings; Mirror 
Symmetry: A Geometric Survey; Moduli Spaces: An 
Introduction; Riemann Surfaces; Topological Sigma 
Models; WDVV Equations and Frobenius Manifolds. 
Introduction to the Physical and 
Mathematical Contexts and Issues 


One of the most exciting developments of mathema- 
tical physics in the last three decades has been the 
discovery of numerous intimate relationships between 
the topology and the geometry of knot theory and the 
dynamics of many domains of “classical” and “new” 
macroscopic physics. Indeed, complex systems of 
knotted and entangled filamentary structures are 
ubiquitous in nature and arise in such disparate 
contexts as electrodynamics, magnetohydrodynamics, 
fluid dynamics (vortex structures), superfluidity, 
dynamical systems, plasma physics, cosmic string 
theory, chaos of magnetic flows and nonlinear 
phenomena, turbulence, polymer physics, and molecular
biology. In recent years, mathematical tools
have been developed to identify and analyze the
complex geometrical and topological structures and
behaviors of such systems and to relate this information
to energy levels and stable states.

The influence of geometry and topology on 
macroscopic physics has been especially fruitful in 
the study and comprehension of the following topics. 


1. Knots and braids in dynamical systems. It is 
now clear that the chaotic behavior of the Hénon- 
Heiles system and other nonlinear systems is driven 
and controlled by topological properties. For example, 
it has been found that trajectories in the phase space 
form hyperbolic knots. The finding of knots in the 
Lorenz equations is another important theme closely 
related to the previous. By varying the Rayleigh 
number r, a parameter in the Lorenz equations, both 
chaotic and periodic behavior is observed. In recent
years, the knots (notably several torus knots) corresponding
to the different periodic solutions of the
system have been found and classified. Finding
hyperbolic knots, and in particular the hyperbolic figure-8
knot, as solutions to the Lorenz equations would
strengthen the suspicion that there exists a new route
to chaos.

2. Topological structures of electromagnetic fields. 
Progress in space physics, astronomy, and
astrophysics over the last decade increasingly reveals
areas. In particular, the interaction of plasma and 
magnetic field can create an astonishing variety of 
structures, which often exhibit linked and knotted 


forms of magnetic flux. In these complex structures of 
the fields, huge amounts of magnetic energy can be 
stored. It is, however, a typical property of astrophysical
plasmas that the dynamics of magnetic fields
alternates between an ideal motion, where all forms
of knottedness and linkage of the field are conserved 
(topology conservation), and a kind of disruption of 
the magnetic structure, the so-called magnetic recon- 
nection. In the latter, the magnetic structure breaks up 
and reconnects, a process often accompanied by 
explosive eruptions, where enormous amounts of 
energy are set free. Magnetic reconnection is in close 
analogy to splitting of knots, which makes us 
confident that the global dynamics of magnetic and 
electromagnetic fields can be characterized with the 
help of such topological quantities as well. 

3. Knotting and unknotting of phase singularities. 
It has long been known that dislocation lines can be 
closed, and recently it was shown that they can be 
knotted and linked. Moreover, Berry and Dennis 
(2001) constructed exact solutions of the Helmholtz
equation representing torus knots and links; in fact, 
a straightforward application of this idea led to 
knotted and linked dislocation lines in stationary 
states of electrons in hydrogen. As a parameter, 
called a, is varied, the topology of dislocation lines 
can change, leading to the creation of knots and 
links from initially simple dislocation loops, and the 
reverse process of unknotting and unlinking. The 
main purpose here is to elucidate the mechanism of 
these changes of topology. All waves are solutions of 
monochromatic wave equations, that is, stationary 
waves, and a is an external parameter that could be 
manipulated experimentally. However, a could 
represent time, and then the analogous solutions of 
time-dependent wave equations would describe 
knotting and linking events in the history of waves. 
The methods of Berry and Dennis are based on exact 
stationary solutions of wave equations, and lead to 
knots and links threaded by multistranded helices. 


The Origins of Topological Vortex 
Dynamics Ideas 


The intimate relationship between three-dimensional 
vortex dynamics and topology was recognized as early 
as 1869 by W Thomson (Lord Kelvin) who tried to 
elaborate a theory of matter in which atoms were 
thought to be tiny vortex filaments embedded in an 
elastic-like fluid medium, called ether. Accordingly, 
the infinite variety of possible chemical compounds 
was given by the endless family of topological 
combinations of linked and knotted vortices. Kelvin
was inspired by the work of Gauss, who, in an attempt
to describe topologically the behavior of two
inseparably linked closed circuits carrying electric
current, found a relationship between the magnetic
action induced by the currents and a pure number that
depends only on the type of link, and not on the
geometry: this number is the first topological invariant,
now known as the linking number.

In modern mathematical terms, Gauss introduced
an invariant of a link consisting of two simple closed
curves $\gamma_1, \gamma_2$ in $\mathbb{R}^3$, namely the signed
number of turns of one of the curves around the other,
the linking coefficient $\{\gamma_1, \gamma_2\}$ of the link.
His formula for this is

\[
N = \{\gamma_1, \gamma_2\}
  = \frac{1}{4\pi} \oint_{\gamma_1} \oint_{\gamma_2}
    \frac{([d\gamma_1, d\gamma_2],\ \gamma_1 - \gamma_2)}
         {|\gamma_1 - \gamma_2|^3}
\tag{1}
\]

where [ , ] denotes the vector (or cross) product of
vectors in $\mathbb{R}^3$ and ( , ) the Euclidean scalar product.
Thus, this integral always has an integer value N. If
we take one of the curves to be the z-axis in $\mathbb{R}^3$ and
the other to lie in the (x, y)-plane, then the formula
[1] gives the net number of turns of the plane curve
around the z-axis. It is interesting to note that the
linking coefficient [1] may be zero even though the
curves are nontrivially linked (the Whitehead link is
the standard example). Thus, its having a
nonzero value represents only a sufficient condition
for nontrivial linkage of the loops. This last
consideration leads naturally to the mathematical
concepts of knots and links, whose most striking
properties have been investigated in our introductory
article (see Mathematical Knot Theory).
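
As a rough numerical illustration of formula [1], the double integral can be approximated by a midpoint rule over two closed polygons. The sketch below is only that: the function names, the choice of two unit circles as test curves, and the sample count n are assumptions made here for illustration, not part of the original text.

```python
import math


def sample_closed_curve(curve, n):
    """Return n points curve(t_k), t_k = 2*pi*k/n, on a closed curve."""
    return [curve(2.0 * math.pi * k / n) for k in range(n)]


def segments(points):
    """Midpoint and chord vector of every edge of the closed polygon."""
    n = len(points)
    out = []
    for k in range(n):
        a, b = points[k], points[(k + 1) % n]
        mid = tuple(0.5 * (a[i] + b[i]) for i in range(3))
        chord = tuple(b[i] - a[i] for i in range(3))
        out.append((mid, chord))
    return out


def gauss_linking_number(curve1, curve2, n=200):
    """Midpoint-rule estimate of formula [1] for two disjoint closed curves."""
    seg1 = segments(sample_closed_curve(curve1, n))
    seg2 = segments(sample_closed_curve(curve2, n))
    total = 0.0
    for x, dx in seg1:
        for y, dy in seg2:
            # [dx, dy]: cross product of the two tangent increments
            cx = dx[1] * dy[2] - dx[2] * dy[1]
            cy = dx[2] * dy[0] - dx[0] * dy[2]
            cz = dx[0] * dy[1] - dx[1] * dy[0]
            # x - y and its cubed length
            rx, ry, rz = x[0] - y[0], x[1] - y[1], x[2] - y[2]
            dist3 = (rx * rx + ry * ry + rz * rz) ** 1.5
            total += (cx * rx + cy * ry + cz * rz) / dist3
    return total / (4.0 * math.pi)


def circle_xy(t):
    """Unit circle in the (x, y)-plane centered at the origin."""
    return (math.cos(t), math.sin(t), 0.0)


def circle_xz_linked(t):
    """Unit circle in the (x, z)-plane threading circle_xy (a Hopf link)."""
    return (1.0 + math.cos(t), 0.0, math.sin(t))


def circle_xz_far(t):
    """Same circle moved away so that it no longer threads circle_xy."""
    return (5.0 + math.cos(t), 0.0, math.sin(t))


if __name__ == "__main__":
    print(gauss_linking_number(circle_xy, circle_xz_linked))
    print(gauss_linking_number(circle_xy, circle_xz_far))
```

For the threaded pair the estimate should be close to an integer of absolute value 1 (the sign depends on the orientations chosen), and close to 0 for the separated pair.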

The other source of inspiration for Kelvin's theory
of matter was Helmholtz's laws of vortex
motion, which state that in an ideal fluid (where
there is no viscosity) a vortex lives forever: two closed
vortex rings, once linked, will always be linked. The
classical results obtained by Helmholtz are basic to
understanding the dynamics of Euler motions. The
vorticity of a velocity field is its curl and is denoted
$\omega(z, t) := \operatorname{curl} X(z, t)$. In two dimensions, the
vorticity is a real-valued function and
$\omega_t = -\Delta \Psi$, where $\Psi$ is
the stream function of X(z, t). Recall that the
push-forward of a scalar field (0-form) s under a
diffeomorphism f is $f_* s = s \circ f^{-1}$. These results, in
modern terms, can be stated as follows:


Theorem (Helmholtz-Kelvin). An incompressible
fluid motion $(M_t, \phi_t)$ with velocity field X and
vorticity $\omega_t$ is Euler if and only if its vorticity is
passively transported,

\[
\phi_{t*}\, \omega_0 = \omega_t
\]

and the circulation around every smooth simple closed
curve C is preserved under the flow,

\[
\frac{d}{dt} \oint_{\phi_t(C)} X \cdot dr = 0
\]


One knows that in three dimensions, the Helmholtz- 
Kelvin theorem says that the vorticity (now a vector 
field) is transported. Thus, with generic initial 
vorticity a 3D time-periodic Euler fluid motion 
preserves a nontrivial vector field. One very interest- 
ing question that remains to be elucidated is the 
following: are there any chaotic, time-periodic Euler 
flows with stationary boundaries? 


The Connection between Topological and 
Numerical Invariants of Knots and the 
Physical Helicity of Vector Fields 


The writhing number of a curve in Euclidean three- 
dimensional space is the standard measure of the 
extent to which the curve wraps and coils around 
itself; it has proved its importance for
electrodynamics and fluid mechanics in the study of the
knotted structures of magnetic vortices and
dynamical flows, and for molecular biology in the
study of knotted duplex DNA and the enzymes
which affect it. The helicity of a divergenceless
vector field defined on a domain in Euclidean
3-space, introduced by Woltjer in 1958 in an
astrophysical context and named by Moffatt in
1969 in the study of its topological meaning, is the
standard measure of the extent to which the field 
lines wrap and coil around one another; it plays 
important roles in fluid mechanics, magnetohydro- 
dynamics, and plasma physics. The “Biot-Savart 
operator” associates with each current distribution 
on a given domain the restriction of its magnetic 
field to the domain. When the domain is simply 
connected, the divergence-free fields which are 
tangent to the boundary and which minimize energy 
for given helicity provide models for stable force- 
free magnetic fields in space and laboratory plasmas; 
these fields appear mathematically as the extreme 
eigenfields for an appropriate modification of the 
Biot-Savart operator. Information about these fields 
can be converted into bounds on the writhing 
number of a given piece of DNA. 

Recent research (Cantarella et al. 2001) has
obtained rough upper bounds for the writhing
number of a knot or link in terms of its length and
thickness, and rough upper bounds for the helicity
of a vector field in terms of its energy and the
geometry of its domain. It was also shown that in
the case of classical electrodynamics in vacuum, the
natural helicity invariant, called the electromagnetic
helicity, has an important particle meaning: the
difference between the numbers of right- and
left-handed photons. Recently, a topological model of
classical electrodynamics has been proposed in 
which the helicity is topologically quantized, in a 
relation that connects the wave and particle aspects 
of the fields (Trueba and Rañada 2000). 

Consider two disjoint closed space curves, C and
C', and Gauss' integral formula for their linking
number

\[
\mathrm{Lk}(C, C') = \frac{1}{4\pi} \int_{C \times C'}
  \left( \frac{dx}{ds} \times \frac{dy}{dt} \right) \cdot
  \frac{x - y}{|x - y|^3} \, ds \, dt
\tag{2}
\]

The curves C and C' are assumed to be smooth and
to be parametrized by arclength. The question
is what happens to this integral when the
two space curves C and C' come together and
coalesce as one curve C. At first glance, the
integrand looks like it might blow up along the
diagonal of C × C, but a careful calculation shows
that in fact the integrand approaches zero on the
diagonal, and so the integral converges. Its value is
the writhing number Wr(C) of C defined above:

\[
\mathrm{Wr}(C) = \frac{1}{4\pi} \int_{C \times C}
  \left( \frac{dx}{ds} \times \frac{dy}{dt} \right) \cdot
  \frac{x - y}{|x - y|^3} \, ds \, dt
\tag{3}
\]


A very useful result is due to Fuller (1978):
the writhing number of a knot K is the average
linking number of K with its slight perturbations in
every possible direction:

\[
\mathrm{Wr}(K) = \frac{1}{4\pi} \int_{W \in S^2}
  \mathrm{Lk}(K, K + \varepsilon W) \, d(\mathrm{area})
\tag{4}
\]

This is helpful for getting a quick approximation to
the writhing number of a knot which almost lies in a
plane; in the example of a nearly planar trefoil knot,
$\mathrm{Wr}(K) \approx \pm 3$, the sign depending on the handedness.
Here, a very important result must be recalled, a
"bridge theorem," proved by Berger and Field
(1984), see also Ricca and Moffatt (1992), which
connects helicity of vector fields to writhing of knots
and links, and which can be used to convert upper
bounds on helicity into upper bounds on writhing.
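
The estimate quoted above for a nearly planar trefoil can be checked, at least roughly, by discretizing the double integral [3]. The following is a minimal sketch under stated assumptions: the particular trefoil parametrization (a (2,3) torus curve flattened to height h = 0.2), the sample count, and all function names are choices made here for illustration and do not come from the original text.

```python
import math


def flat_trefoil(t, h=0.2):
    """A trefoil lying close to the (x, y)-plane; h sets its height."""
    r = 2.0 + math.cos(3.0 * t)
    return (r * math.cos(2.0 * t), r * math.sin(2.0 * t), h * math.sin(3.0 * t))


def writhe(curve, n=600):
    """Midpoint-rule estimate of the writhing integral [3] for a closed curve."""
    pts = [curve(2.0 * math.pi * k / n) for k in range(n)]
    mids, chords = [], []
    for k in range(n):
        a, b = pts[k], pts[(k + 1) % n]
        mids.append(tuple(0.5 * (a[i] + b[i]) for i in range(3)))
        chords.append(tuple(b[i] - a[i] for i in range(3)))
    total = 0.0
    for i in range(n):
        x, dx = mids[i], chords[i]
        for j in range(n):
            if i == j:
                # the integrand tends to zero on the diagonal, so skip it
                continue
            y, dy = mids[j], chords[j]
            cx = dx[1] * dy[2] - dx[2] * dy[1]
            cy = dx[2] * dy[0] - dx[0] * dy[2]
            cz = dx[0] * dy[1] - dx[1] * dy[0]
            rx, ry, rz = x[0] - y[0], x[1] - y[1], x[2] - y[2]
            dist3 = (rx * rx + ry * ry + rz * rz) ** 1.5
            total += (cx * rx + cy * ry + cz * rz) / dist3
    return total / (4.0 * math.pi)


if __name__ == "__main__":
    print(writhe(flat_trefoil))
```

One would expect the printed value to be near 3 in absolute value, since the planar projection of this curve has three crossings of the same sign; the exact figure depends on h and on the discretization.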


Proposition (Berger and Field). Let K be a smooth
knot or link in 3-space and $\Omega = N(K, R)$ a tubular
neighborhood of radius R about K. Let V be a
vector field defined in $\Omega$, orthogonal to the
cross-sectional disks, with length depending only on
distance from K. This makes V divergence-free and
tangent to the boundary of $\Omega$. Then the writhing
number Wr(K) of K and the helicity H(V) of the
vector field V are related by the formula

\[
H(V) = \mathrm{Flux}(V)^2 \, \mathrm{Wr}(K)
\]

In the formula, Flux(V) denotes the flux of V
through any of the cross-sectional disks D,

\[
\mathrm{Flux}(V) = \int_D V \cdot n \, d(\mathrm{area})
\]

where n is a unit normal vector field to D.

A key feature of this formula is that the helicity of V 
depends on the writhing number of K, but not any 
further on its geometry; in particular, such quantities 
as the curvature and torsion of K do not enter into the 
formula. Berger and Field actually showed that the 
helicity H(V) is a sum of two terms: a “kink helicity,” 
which is given by the right-hand side of the above 
formula, and a "twist helicity," which is easily shown
in our case to be zero. Their proof assumes K is a knot, 
but it is straightforward to extend it to cover links. 

Let $\Omega$ be a compact domain in 3-space with
smooth boundary $\partial\Omega$; we allow both $\Omega$ and
$\partial\Omega$ to be disconnected. Let V be a smooth vector
field (where "smooth" means of class $C^\infty$), defined on the
domain $\Omega$. The helicity H(V) of the vector field V
is defined by the formula

\[
H(V) = \frac{1}{4\pi} \int_{\Omega \times \Omega}
  V(x) \times V(y) \cdot \frac{x - y}{|x - y|^3}
  \, d(\mathrm{vol})_x \, d(\mathrm{vol})_y
\tag{5}
\]



Clearly, helicity for vector fields is the analog of 
writhing number for knots. Both formulas are 
variants of Gauss' integral formula for the linking 
number of two disjoint closed space curves. 

In order to understand this formula for helicity, 
think of V as a distribution of electric current, and 
use the Biot-Savart law of electrodynamics to 
compute its magnetic field: 


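\[
\mathrm{BS}(V)(y) = \frac{1}{4\pi} \int_{\Omega}
  V(x) \times \frac{y - x}{|y - x|^3} \, d(\mathrm{vol})_x
\tag{6}
\]

Then the helicity of V can be expressed as an integrated
dot product of V with its magnetic field BS(V):

\[
\begin{aligned}
H(V) &= \frac{1}{4\pi} \int_{\Omega \times \Omega}
  V(x) \times V(y) \cdot \frac{x - y}{|x - y|^3}
  \, d(\mathrm{vol})_x \, d(\mathrm{vol})_y \\
&= \int_{\Omega} V(y) \cdot \left[ \frac{1}{4\pi} \int_{\Omega}
  V(x) \times \frac{y - x}{|y - x|^3} \, d(\mathrm{vol})_x \right]
  d(\mathrm{vol})_y \\
&= \int_{\Omega} V(y) \cdot \mathrm{BS}(V)(y) \, d(\mathrm{vol})_y
\end{aligned}
\]
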
Cantarella et al. (2001) found two very interesting
results.


Theorem 1 Let K be a smooth knot or link in
3-space, with length L and with an embedded
tubular neighborhood of radius R. Then the
writhing number Wr(K) of K is bounded by

\[
|\mathrm{Wr}(K)| < \frac{1}{4} \left( \frac{L}{R} \right)^{4/3}
\]

For the proof, see Cantarella et al. (2001).


Theorem 2 The helicity of a unit vector field V
defined on the compact domain $\Omega$ is bounded by

\[
|H(V)| < \frac{1}{2} \, \mathrm{vol}(\Omega)^{4/3}
\]
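
To get a feel for how these two bounds scale, they can be evaluated directly. The helper names below are hypothetical, introduced only for this sketch; the last function simply inverts the bound of Theorem 1 to give the length-to-thickness ratio that a curve of prescribed writhing number must exceed.

```python
def writhe_upper_bound(length, radius):
    """Right-hand side of Theorem 1: (1/4) * (L/R)**(4/3)."""
    return 0.25 * (length / radius) ** (4.0 / 3.0)


def helicity_upper_bound(volume):
    """Right-hand side of Theorem 2 for a unit field: (1/2) * vol**(4/3)."""
    return 0.5 * volume ** (4.0 / 3.0)


def min_length_to_radius(writhe):
    """Value that L/R must exceed for |Wr| to be compatible with Theorem 1."""
    return (4.0 * abs(writhe)) ** 0.75


if __name__ == "__main__":
    print(writhe_upper_bound(8.0, 1.0))
    print(helicity_upper_bound(8.0))
    print(min_length_to_radius(3.0))
```

For instance, a tube with writhing number 3 in absolute value needs L/R larger than about 6.4 according to this bound.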


Let us now give a brief overview of the methods
used to find sharp upper bounds for the helicity of
vector fields defined on a given domain $\Omega$ in
3-space. As usual, $\Omega$ will denote a compact domain
with smooth boundary in 3-space. Let $K(\Omega)$ denote
the set of all smooth divergence-free vector fields
defined on $\Omega$ and tangent to its boundary. These
vector fields, sometimes called "fluid knots," are
prominent for several reasons: (1) They are natural
vector fields to study in a "fluid dynamics
approach" to geometric knot theory. (2) They
correspond to incompressible fluid flows inside a
fixed container. (3) They are the vector fields most
often studied in plasma physics. (4) Those that
maximize helicity for given energy (equivalently,
minimize energy for given helicity) provide models
for stable force-free magnetic fields in gaseous
nebulae and laboratory plasmas.
(5) The search for these helicity-maximizing fields
can be converted to the task of solving a system of
partial differential equations. (6) The fluid knots can
reveal some fundamental and still unknown
mechanisms, which characterize the phenomenon
of phase transition, and in particular the transition
from chaotic (unstable) phases and behaviors of
matter to ordered (stable) ones.


Knots and Fluid Mechanics (Vortex Lines, 
Magnetic Helicity, and Turbulence) 


Kelvin's theory, which explained atoms as knotted
vortices in a fluid ether, was seminal in the
development of topological fluid mechanics. The recent
revival (starting in the 1970s) is mainly due to the
work of Moffatt, on the topological interpretation of
helicity, and Arnol'd, on the asymptotic linking number
of space-filling curves. Modern developments have 
been influenced by recent progress in the theory of 
knots and links. 


Influence of Geometry and Topology 
on Fluid Flows 


Ideal topological fluid mechanics deals essentially 
with the study of fluid structures that are 
continuously deformed from one configuration to 
another by ambient isotopies. Since the fluid flow
map $\varphi$ is both continuous and invertible,
$\varphi_{t_1}(K)$ and $\varphi_{t_2}(K)$ generate isotopies of a fluid
structure K (e.g., a vortex filament) for any
$\{t_1, t_2\} \in I$. Isotopic flows generate equivalence
classes of (linked and knotted) fluid structures. In
the case of (vortex or magnetic) fluid flux tubes,
fluid actions induce continuous deformations in D.
One of the simplest deformations is local stretch- 
ing of the tube. From a mathematical viewpoint, 
this deformation corresponds to a time-dependent, 
continuous reparametrization of the tube center- 
line. This reparametrization (via homotopy classes) 
generates ambient isotopies of the flux tube, with 
a continuous deformation of the integral curves. 

Moreover, in the context of the Euler equations,
the Reidemeister moves (or isotopic plane
deformations), which conserve the knot topology,
are performed quite naturally by the action of local
flows on flux tube strands. If the fluid in (D − K) is
irrotational, then these fluid flows (with velocity u)
must satisfy the Dirichlet problem for the Laplacian
of the stream function $\psi$, that is,

\[
u = \nabla \psi \quad \text{in } (D - K), \qquad \nabla^2 \psi = 0
\tag{7}
\]

with the normal component $u_\perp$ of the velocity to the
tube boundary given. Equations [7] admit a unique
solution in terms of local flows, and these flows are
interpretable in terms of Reidemeister moves
performed on the tube strands. Note that boundary
conditions prescribe only $u_\perp$, whereas no condition
is imposed on the tangential component of the
velocity. This is consistent with the fact that
tangential effects do not alter the topology of the
physical knot (or link). The three types of Reidemeister
moves are therefore performed by local fluid flows,
which are solutions to [7], up to arbitrary tangential
actions.


Knotted and Linked Tubes of Magnetic Flux

Let T be the standard solid torus in $\mathbb{R}^3$ given by

\[
\big( (2 + \epsilon \cos\theta) \cos\phi,\ (2 + \epsilon \cos\theta) \sin\phi,\ \epsilon \sin\theta \big)
\tag{8}
\]

where $0 \le \theta, \phi < 2\pi$ and $0 \le \epsilon \le 1$. For relatively
prime integers p and q, let $F_{p,q}$ denote the foliation
of T by the curves $\gamma_{\theta,\epsilon}$ (where $0 < \epsilon \le 1$ and
$0 \le \theta < 2\pi$) given by

\[
\gamma_{\theta,\epsilon}(s) = \big( (2 + \epsilon \cos(\theta + qs)) \cos(ps),\
  (2 + \epsilon \cos(\theta + qs)) \sin(ps),\ \epsilon \sin(\theta + qs) \big)
\tag{9}
\]

where $0 \le s < 2\pi$.
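
A short sketch may help in reading formula [9]: each leaf of the foliation is a closed curve that stays at constant distance ε from the core circle of T while winding p times the long way and q times the short way. The function names below are assumptions introduced for this illustration only.

```python
import math


def torus_curve(s, theta, eps, p, q):
    """Point gamma_{theta,eps}(s) of formula [9] on the standard solid torus."""
    r = 2.0 + eps * math.cos(theta + q * s)
    return (r * math.cos(p * s),
            r * math.sin(p * s),
            eps * math.sin(theta + q * s))


def distance_from_core(point):
    """Distance of a point from the core circle x^2 + y^2 = 4, z = 0."""
    x, y, z = point
    return math.hypot(math.hypot(x, y) - 2.0, z)


if __name__ == "__main__":
    # Sample one leaf of F_{2,3}: every sample sits at distance eps from the core.
    for k in range(5):
        s = 2.0 * math.pi * k / 5
        print(distance_from_core(torus_curve(s, 0.3, 0.5, 2, 3)))
```

Because p and q are integers, the curve closes up after s runs once through [0, 2π).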


Definition A magnetic tubular link (or magnetic
link) is a smooth immersion into $\mathbb{R}^3$ of finitely many
disjoint standard solid tori $\bigsqcup_{i=1}^{n} T_i$,

\[
L : \bigsqcup_{i=1}^{n} T_i \to \mathbb{R}^3
\]

and a smooth magnetic field B on $\mathbb{R}^3$ such that

(i) L is an imbedding when restricted to the
interior of $\bigsqcup_{i=1}^{n} T_i$,
(ii) the bounding surface of $\bigcup_i L(T_i)$, that is,
$\bigcup_i L(\partial T_i)$, is a magnetic surface, and
(iii) for each component $LT_i$, there exist relatively
prime nonzero integers $p_i$ and $q_i$ such that L
maps the foliation $F_{p_i, q_i}$ of $T_i$ onto the integral
curves of B in $LT_i$.


Remark Thus, for every fixed i and j, the linking
number between an arbitrary field line in $LT_i$ and
an arbitrary field line in $LT_j$ is the same regardless
of which integral curves are chosen from $LT_i$ and
$LT_j$, respectively. This is true even when i = j.


It follows that a magnetic link $\bigcup_i LT_i$ remains a
magnetic link under the action of the fluid flow $g_t$, that
is, $\bigcup_i g_t LT_i$ is a magnetic link for $t \ge 0$.

Keeping in mind that the magnetic field B is frozen in the
fluid, we can now find and study those properties of
magnetic links that are invariant under the action of
fluid flow. One obvious invariant is the volume $V_i$
of each flux tube $g_t LT_i$, that is,

\[
V_i = \mathrm{Vol}(LT_i) = \mathrm{Vol}(g_t LT_i)
    = \iiint_{g_t LT_i} d(\mathrm{vol})
\tag{10}
\]

which remains unchanged because of incompressibility.
Another invariant of fluid flow is defined as 
follows: 


Definition Let L be a magnetic link. For each solid
torus $T_i$, choose a meridional disk $D_i$. The magnetic
flux $\Phi_i = \Phi(LT_i)$ in the ith component is the surface
integral defined as

\[
\Phi_i = \Phi(LT_i) = \iint_{LD_i} B \cdot \nu \, d(\mathrm{area})
\]

where $\nu$ denotes the normal to the surface $LD_i$
pointing in the positive direction induced by the B
field.


It can be shown that $\Phi_i$ is independent of the
chosen meridional disk. It also can be shown that
each $\Phi_i$ is a fluid flow invariant, that is,

\[
\Phi_i(g_t LT_i) = \iint_{g_t LD_i} B \cdot \nu \, d(\mathrm{area})
\tag{11}
\]

is independent of t.

One more fluid invariant that will play a central 
role in the energy minimization of magnetic links is 
given by the following definition. 


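Definition The helicity of a magnetic link L is
defined as

\[
H(L) = \iiint_{\bigcup_i LT_i} A \cdot B \, d(\mathrm{vol})
\]

where A is a vector potential for B (that is, B = curl A).

The term helicity was first introduced in a fluid
context by Moffatt; it was previously used in
particle physics for the scalar product of the
momentum and spin of a particle. In another connection,
note that the helicity H(L) is the same as the
Chern-Simons action:

\[
H(L) = \int A \wedge dA
     = \int \operatorname{tr}\left( A \wedge dA + \tfrac{2}{3}\, A \wedge A \wedge A \right)
\tag{12}
\]

where A now denotes the magnetic vector potential
as a 1-form (for this abelian potential the cubic term
vanishes).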

It can be shown that H(L) is gauge invariant, and 
hence well defined. 


Theorem (Moffatt). The helicity is invariant under
fluid flow, that is,

\[
\frac{d}{dt} H(g_t L) = 0
\]

Arnol'd (1998) defines the helicity in a more abstract
setting and shows that it is invariant under the group
S(Diff) of volume-preserving diffeomorphisms.

The following theorem summarizes the many
results due to Moffatt, Ricca, Berger, Lomonaco,
Hornig, Kauffman, and others, relating the helicity
of magnetic links to linking and to magnetic flux.


Theorem Let L be a magnetic link. Then

\[
H(L) = \sum_{1 \le i \le n} \Phi_i^2 \, SL_{F_i}
     + 2 \sum_{1 \le i < j \le n} \Phi_i \Phi_j \, LK_{ij}
\]

where $SL_{F_i}$ denotes the self-linking number of
the axis curve of the tube $LT_i$ with respect to the
framing $F_i$ induced by the integral curves of the
magnetic field B within $LT_i$, and $LK_{ij}$ denotes
the linking number between any integral curve of
the magnetic field B in $LT_i$ with any integral curve
of the magnetic field B in $LT_j$.


Remark In fact, $SL_{F_i}$ is the same as the linking
number between any two integral curves of the
magnetic field B within the tube $LT_i$.


Thus, as many authors have shown, the helicity
does reflect the topology and the geometry of the
magnetic lines of force within a magnetic link. If, for
example, L has only one component, that is, L is a
magnetic knot, then

\[
H(L) = \Phi^2 \, SL_F(C)
\tag{13}
\]

where $SL_F(C)$ is the self-linking number of the axis
curve C of the knotted tube with respect to the
framing F induced by the integral curves of the
magnetic field B within the magnetic knot. If, for
example, the tube is knotted in the form of a trefoil
and if the magnetic lines of force appear to be
parallel to the axis curve when the trefoil is placed
on a plane flat surface, then $SL_F = \pm 3$ and

\[
H = \pm 3 \Phi^2
\tag{14}
\]

On the other hand, if, for example, the magnetic
lines of force induce the trivial framing in each
component, then

\[
H(L) = 2 \sum_{1 \le i < j \le n} \Phi_i \Phi_j \, LK_{ij}
\tag{15}
\]

Thus, if L is a magnetic two-component Hopf link
with no twisting of the integral curves of the
magnetic field within the components of L, then

\[
H(L) = \pm 2 \Phi_1 \Phi_2
\tag{16}
\]

because the self-linking number based on the B-field
framing is zero for each component, and the linking
number between the two components is ±1.
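
The helicity formula of the theorem above, together with the special cases [13] to [16], amounts to a small piece of arithmetic on fluxes and linking data. The sketch below evaluates it; the function name and the way the input is organized (a list of fluxes, a list of self-linking numbers, and a symmetric matrix of linking numbers) are assumptions made here for illustration.

```python
def link_helicity(fluxes, self_linking, linking):
    """Helicity of a magnetic link from fluxes and linking data.

    fluxes[i]       -- magnetic flux Phi_i of the ith tube
    self_linking[i] -- self-linking number SL_{F_i} of the ith tube
    linking[i][j]   -- linking number LK_ij between tubes i and j
    """
    n = len(fluxes)
    # sum over i of Phi_i^2 * SL_{F_i}
    total = sum(fluxes[i] ** 2 * self_linking[i] for i in range(n))
    # plus 2 * sum over i < j of Phi_i * Phi_j * LK_ij
    total += 2 * sum(fluxes[i] * fluxes[j] * linking[i][j]
                     for i in range(n) for j in range(i + 1, n))
    return total


if __name__ == "__main__":
    # Untwisted Hopf link with fluxes 2 and 3, linking number +1: formula [16]
    print(link_helicity([2.0, 3.0], [0, 0], [[0, 1], [1, 0]]))
    # Trefoil flux tube with flux 2 and SL_F = +3: formula [14]
    print(link_helicity([2.0], [3], [[0]]))
```

Both calls return 12.0, in agreement with 2 Φ_1 Φ_2 = 12 for the Hopf link and 3 Φ^2 = 12 for the trefoil.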


Energy of Magnetic Knots and Links 


Let us conclude this section with the definition of 
the energy of a magnetic link. 


Definition The magnetic energy $E_M(L)$ of a
magnetic link L is defined by the classical formula

\[
E_M(L) = \frac{1}{8\pi} \iiint |B|^2 \, d(\mathrm{vol})
\quad \text{(Gaussian units)}
\]

Although the energy $E_M$ is not flow invariant, it will
play a central role in magnetic relaxation of knots
and minimum energy magnetic links.

Consider a magnetic link L in a perfectly
conducting, incompressible, viscous fluid. As a result
of dissipative frictional fluid forces, the magnetic
energy $E_M(g_t L)$ of $g_t L$ will decrease with time t. In
losing energy, the magnetic lines of force will
contract. On the other hand, since this is a
volume-preserving process, the cross sections of the flux
tubes of $g_t L$ will at the same time expand. These
changes of geometry occur while the flux $\Phi$, volume
V, and helicity of $g_t L$ remain the same. In other
words, knotted magnetic flux tubes left free to
evolve in such a fluid will do so by conserving their
magnetic flux $\Phi$ and volume V, but converting their
magnetic energy into kinetic energy, which in turn
dissipates by internal friction. Magnetic links and
knots evolve from high to low magnetic energy
levels, conserving topology; and because of the
induced shortening of field lines under conservation
of volume, they become fatter and fatter, with an
increase of the average tube cross section.

This process cannot continue indefinitely.
Eventually, the magnetic flux tubes of $g_t L$ must make
contact with each other. In other words, the topology
of the magnetic link $g_t L$, as expressed in knotting and
linking, creates a barrier to the full dissipation of the
magnetic link's energy, that is, $E_M(g_t L)$ has a positive
lower bound that results from the topology of $g_t L$.
That means, in other words, that relaxation is 
obstructed by the knottedness and entanglement of 
the field lines, and a minimum magnetic energy is 
reached. Thus, the magnetic link will reach a 
nontrivial stable and invariant energy state, much as 
Kelvin conjectured his atomic vortices would. 

Various estimates of magnetomechanical energy in
terms of topological quantities have been put forward
in recent years (see Freedman and He (1994)). These
relations give lower bounds for the energy levels
attainable by knot or link types by taking into account
the effects that linking numbers and number of
crossings have on the energy of the relaxed state.
These bounds are expressed by relationships of the kind

\[
E_{\min} = h(C_{\min};\ \Phi, V, N)
\tag{17}
\]

where $E_{\min}$ is the equilibrium energy and h gives the
relationship between physical quantities (such as
total flux $\Phi$, number of tubes N, magnetic volume V)
and topology, given here by the minimal possible
number of crossings $C_{\min}$. These relations offer
numerous advantages due to the explicit dependence
on qualitative properties of the flow field. A simple
example is provided by the analysis of three braids,
which shows that magnetic energy grows
quadratically in time due to random braiding. This means
that the least possible amount of magnetic energy
that can be attained by the physical knot or link is
determined purely by its topology. If topological
information sets the levels of minimum energy
accessible to the knot or link, geometric properties
may also influence the relaxation process.
Considerations of helicity and linking numbers, for
example, demonstrate that internal rearrangement
of magnetic field geometry leads to a spectrum of
different asymptotic endstates with the same
topology. Moreover, magnetic knots have a natural
tendency to get rid of excessive torsion of field
lines and S-shaped tube geometries, and this may
influence the relaxation process.

Since the helicity $H(g_t L)$ is both an invariant of
fluid flow and an expression of the magnetic link
$g_t L$'s topology, the following theorem, first stated by
Moffatt, is a mathematical expression of this
topological bound.


Theorem Let L be a magnetic link. Then

\[
E_M(L) \ge q_0 \, |H(L)|
\]

where $q_0$ is a nonzero constant that is independent
of the magnetic link.


Freedman and He (1991) obtain more subtle and
tighter topological bounds on the minimum energy
of magnetic links. For example, for a magnetic knot
K, they prove that

\[
E_M(K) \ge \frac{1}{8\pi} \left( \frac{16}{\pi V(K)} \right)^{1/3}
  \Phi(K)^2 \, \mathrm{ac}(K)
\ge \frac{1}{8\pi} \left( \frac{16}{\pi V(K)} \right)^{1/3}
  \Phi(K)^2 \, (2 g(K) - 1)
\quad \text{(Gaussian units)}
\tag{18}
\]

where V(K) denotes the volume of the magnetic knot
K, $\Phi(K)$ denotes the flux in K, ac(K) is the asymptotic
crossing number, and g(K) is the genus of the knot K.
Freedman and He conjecture that ac(K) = c(K), where
c(K) is the crossing number, that is, the minimum
number of crossings among all plane diagrams
representing the knot K. Besides, Moffatt (1990) suggests
that the minimum energy spectrum of a magnetic knot
can be used to construct new knot invariants.


Topological Changes, Dissipation, 
and Reconnection in Fluid Patterns 


As we saw above, topological changes do occur 
when dissipative effects become predominant over 
the coherency of structures. When this happens, 
there is a dramatic change of fluid patterns, often on 
small timescales compared to evolution. The change 
occurs through the formation and disappearance of 
physical reconnections in the fluid pattern. In real 
fluids, for example, vortex and magnetic tubes do 
interact and reconnect freely. From a dynamical
systems viewpoint, reconnections take place when the
vector field lines (streamlines, vortex lines, or
magnetic lines) cross each other. If two field lines
meet, the point of crossing is a true nodal point, like 
a bifurcation in a path. Dissipative effects allow the 
reconnection to proceed through such points. 

In dissipative fluids, mathematical and physical 
properties are no longer conserved, and during the 
process we lose part of the original information. 
However, some of the invariants are rather robust 
and may only degrade slowly. One of them is magnetic 
helicity, the magnetic analog of the kinetic helicity. Its 
dissipation during reconnection can be modest; in 
particular, if the reconnection timescale is small 
compared to classical dissipation times, then helicity 
loss will be negligible. The robustness of magnetic 
helicity plays a central role in fusion plasma physics 
and in many astrophysical contexts. On the other hand, 
large changes in kinetic helicity are intimately related to 
qualitative changes in the topology of vortex flows. 

Under Euler's equations, the helicity of a vortex
tube of vorticity $\omega$ and velocity u is defined by
$H = \int u \cdot \omega \, dV$. The integral is taken over the tube
volume V occupied by $\omega$. Now, for n knotted and
linked vortex tubes, each of (constant) strength
(total vorticity) $\Phi_i$ ($1 \le i \le n$), the helicity of the
whole system can be expressed in terms of linking
numbers $\mathrm{Lk}_{ij}$ as

\[
H = \sum_{i,j} \mathrm{Lk}_{ij} \, \Phi_i \Phi_j
\]

where $\mathrm{Lk}_{ij}$ is equal to $\mathrm{Lk}_{ji}$; this is a topological invariant
whose value does not change under continuous 
deformation of the fluid structure. Since helicity 
and flux-tube strength are measurable conserved 
quantities, the above equation provides useful 
information about the topology of the flow field 
and flow structures. In addition, by direct measure- 
ments of helicity and application of conservation of 
topology, one can estimate average geometric 
quantities, such as the mean twist of field lines, 
and their contribution to the total energy. 


Brief Conclusion 


In this article, we have made an attempt to indicate 
how “classical” field theories, which have been 
successfully used to describe physics of fundamental 
structures and forces of nature, can also be used to 
study geometry and topology of low-dimensional 
manifolds. These developments not only provide new
insights into old problems of the topology of these
manifolds but also have been responsible for
profoundly interesting new mathematics (fluid
mechanics, dynamical flows, and polymer biophysics
are perhaps the most significant examples of recent
years). In particular, fluid dynamics, a topological
macroscopic field theory, provides a powerful
framework for the modern theory of knots and links in
3-manifolds. Moreover, as we saw here, it provides
a physical interpretation of the linking, self-linking, and
writhing numbers of knots and links. The present
article was essentially aimed at illustrating such a
relationship. Thus, the most fundamental result we
reported here is the relation (formula) connecting the
helicity of vector (magnetic) fields to the writhing
number of knots: $H(V) = \mathrm{Flux}(V)^2 \, \mathrm{Wr}(K)$. So, the
writhing number for knots is the analog of helicity for
vector fields. Both expressions of these invariants are
variants of the (Gaussian) integral formula for the
linking number of two disjoint closed space curves.
Further investigations of these invariants and their 
mathematical properties might throw new light on 
the interfaces between many different areas of 
macroscopic and quantum physics. 


See also: The Jones Polynomial; Knot Theory and 
Physics; Magnetohydrodynamics; Mathematical Knot 
Theory; Stability of Flows; Superfluids; Topological 
Quantum Field Theory: Overview; Vortex Dynamics; 
Yang-Baxter Equations. 
new fields of research. A well-known example is the
prediction of Seiberg-Witten invariants as building 
blocks of Donaldson invariants. However, there are 
others such as the recent proposal for the coeffi- 
cients of the HOMFLY polynomial invariants for 
knots as quantities related to enumerative geometry. 
These developments have drawn the attention of 
mathematicians and physicists into TQFT since the 
1980s, a very fruitful period in which both commu- 
nities have benefited from each other. 

Topology has always been present in mathematical 
physics, in particular when dealing with aspects of
quantum physics. Global effects play an important
role in quantum-mechanical models and topology 
becomes an essential ingredient in their description. 
TQFT itself appeared in the winter of 1987 after 
Witten’s work (Witten 1988a) on Donaldson theory 
(Donaldson 1990), but a series of papers during the 
1980s which dealt with topological aspects of field and 
string theory anticipated its existence. Two of these 
correspond to Witten’s works on supersymmetric 
quantum mechanics and supersymmetric sigma mod- 
els (Witten 1982) that led to a generalization of Morse 
theory. This generalization was considered by Floer 
(1987) in a new context that constituted the key 
element in Witten’s construction of TQFT. These 
developments were certainly influenced by Atiyah 
(1988). TQFT was born as a result of the interplay 
between physics and mathematics. This has been a 
constant feature all along its development. 

Soon after the formulation of the TQFT 
addressing Donaldson theory, now known 
as Donaldson-Witten theory, Witten formulated a
new TQFT which focuses on knot invariants such as 
the Jones polynomial and its generalizations (Jones 
1985). Witten (1989) constructed Chern-Simons 
gauge theory and proved its relation to the theory 
of knot and link invariants. This theory possesses 
different features from Donaldson-Witten theory,
and in fact it turns out that these two theories fall 
into two different general types of TQFTs as will 
be explained in the following section. Anyhow, 
despite their formal differences, both Donaldson- 
Witten and Chern-Simons gauge theory emerged 
as a novel way to express topological invariants in 
terms of quantum field theory quantities as well as 
to generalize their previous formulation. But there 
was much more to them than it seemed in their 
beginnings. Once these topological invariants were 
formulated in field theory language, one had a 
huge machinery to study them from different 
points of view. Theoretical physicists have devel- 
oped many useful tools to study quantum field 
theory. The use of these tools led to new frame- 
works for these topological invariants. 

In this overview we provide the basics
of TQFT and briefly describe two examples,
Donaldson-Witten theory and Chern-Simons gauge
theory, to explain how the general features are
implemented. Some excellent reviews on the subject
(Birmingham et al. 1991, Cordes et al. 1996,
Labastida and Mariño 2004) are available. The
organization of this work is as follows. In the
following section we present a general introduction
to TQFT from a functional integral point of view.
Next, we touch upon the twisting of extended
supersymmetry as a general constructive approach
to TQFT. This is followed by a section on
Donaldson-Witten theory, where we discuss the
computation of its observables from a perturbative
approach, showing their relation to the Donaldson
invariants. Next, we introduce Chern-Simons gauge
theory as a theory of knot and link invariants. The
penultimate section deals with advanced developments
in TQFT. Finally, we end with some
concluding remarks.


Topological Quantum Field Theory 


We will start our overview by presenting the most 
general structure of a TQFT from a functional 
integral point of view which, though not rigorously 
defined, is the approach that has led to the most 
important developments. As in conventional quan- 
tum field theory, axiomatic approaches to TQFT do 
exist, but we will not follow that route here. 

Let us consider an n-dimensional Riemannian
manifold X endowed with a metric $g_{\mu\nu}$ and a
quantum field theory on it. We will say that this
theory is "topological" if there exist operators in the
theory such that their correlation functions do not
depend on the metric. If we denote these operators
by $O_i$ (where i is a generic label), then

$$\frac{\delta}{\delta g^{\mu\nu}} \langle O_{i_1} \cdots O_{i_n} \rangle = 0$$  [1]

where $\langle \cdots \rangle$ denotes a vacuum expectation value.
The operators that satisfy this equation are called
"topological observables."

The simplest way to achieve metric independence is 
to consider a theory whose action and operators do not 
depend on the metric. In this situation, if no 
anomalous metric dependence is generated upon 
quantization, the correlation functions of these opera- 
tors satisfy [1] and lead to topological invariants on X. 
Theories of this sort are collectively referred to as 
Schwarz-type TQFTs, and well-known examples are 
Chern-Simons gauge theory and BF theories. However,
Schwarz-type theories are too restrictive. One
would like to have a theory satisfying property [1] with
a weaker condition on the action. This can be achieved
with the help of a symmetry. The resulting TQFTs are
said to be of Witten type or cohomological type, the main
examples being Donaldson-Witten theory and topological
sigma models (Witten 1988b).

For TQFTs of Witten type, the action may depend
on the metric. However, the theory has an underlying
scalar symmetry $\delta$ acting on the fields $\phi_i$. Since $\delta$ is a
symmetry, the action of the theory satisfies $\delta S(\phi_i) = 0$.
In these theories, metric independence of the correlation
functions is achieved as follows. Let $T_{\mu\nu} =
(\delta/\delta g^{\mu\nu}) S(\phi_i)$ be the energy-momentum tensor of
the theory. It turns out that the energy-momentum
tensor is $\delta$-exact:

$$T_{\mu\nu} = -i\,\delta G_{\mu\nu}$$  [2]

$G_{\mu\nu}$ being some tensor. Indeed, if [2] is satisfied, it
follows that for any set of operators $O_i$ which are
$\delta$-invariant,

$$\begin{aligned}
\frac{\delta}{\delta g^{\mu\nu}} \langle O_{i_1} O_{i_2} \cdots O_{i_n} \rangle
  &= \langle O_{i_1} O_{i_2} \cdots O_{i_n} T_{\mu\nu} \rangle \\
  &= -i \langle O_{i_1} O_{i_2} \cdots O_{i_n}\, \delta G_{\mu\nu} \rangle \\
  &= \pm i \langle \delta ( O_{i_1} O_{i_2} \cdots O_{i_n} G_{\mu\nu} ) \rangle \\
  &= 0
\end{aligned}$$  [3]

In this computation we have assumed that
the symmetry $\delta$ is not anomalous and that there are
no contributions coming from boundary terms, since
we have integrated by parts in field space. This is not
always the case, and in fact the situations in which one
of these two properties fails lead to rich phenomena. In
those cases, for example, in Donaldson-Witten theory
on manifolds with $b_2^+ = 1$, the correlation functions fail
to be topological invariants in a controlled manner,
which unveils many interesting properties.

We will now describe Witten-type theories in a
general context. The general structure of Schwarz-type
theories is much simpler and will be illustrated in
the example presented below. In Witten-type theories
the observables are the $\delta$-invariant operators. It is
simple to prove that $\delta$-exact operators decouple from
the theory. Indeed, if $O_j$ is $\delta$-exact, $O_j = \delta \tilde{O}_j$, then

$$\langle O_j O_{i_1} \cdots O_{i_n} \rangle = \langle (\delta \tilde{O}_j) O_{i_1} \cdots O_{i_n} \rangle
  = \langle \delta ( \tilde{O}_j O_{i_1} \cdots O_{i_n} ) \rangle = 0$$  [4]

Thus, one can restrict the set of observables to the
cohomology of $\delta$:

$$O \in \frac{\mathrm{Ker}\,\delta}{\mathrm{Im}\,\delta}$$  [5]

There is no reason a priori why the $\delta$-symmetry
should be a scalar Grassmannian symmetry, but in
all known models of Witten-type TQFTs this turns
out to be the case. Thus, these theories violate the
spin-statistics theorem. In all these models the
algebra of the $\delta$ symmetry has the form

$$\delta^2 = Z$$  [6]

where Z is a symmetry transformation (typically a
gauge symmetry of some sort). This property forces
one to consider Z-invariant observables and to work in
the context of "equivariant cohomology."

The observables of Witten-type theories fit into a
general pattern that we describe now. The key
ingredient is a map between the homology of X and
the equivariant cohomology of $\delta$. Given an operator
$\phi^{(0)}$ in the equivariant cohomology of $\delta$, let us
consider the following set of equations:

$$d\phi^{(n)} = \delta\phi^{(n+1)}, \qquad n \ge 0$$  [7]

where the operators $\phi^{(n)}$ (n = 1, ..., dim X) are differential
forms of degree n on X and d is the de Rham
differential. These differential equations are called
"descent equations" and their solutions $\phi^{(n)}$ (n > 0)
"topological descendants" of $\phi^{(0)}$. We will show how
to construct a solution to these equations on general
grounds.

The topological descendants lead to the construction
of a set of elements of the equivariant cohomology
of $\delta$. Let $\gamma_n$ be an n-cycle on X, $\gamma_n \in H_n(X)$,
and let us consider the following operator:

$$W_{\phi^{(0)}}^{(\gamma_n)} = \int_{\gamma_n} \phi^{(n)}$$  [8]

This operator is $\delta$-invariant,

$$\delta W_{\phi^{(0)}}^{(\gamma_n)} = \int_{\gamma_n} \delta\phi^{(n)} = \int_{\gamma_n} d\phi^{(n-1)}
  = \int_{\partial\gamma_n} \phi^{(n-1)} = 0$$  [9]

since $\partial\gamma_n = 0$. On the other hand, if $\gamma_n$ were trivial
in homology, that is, if $\gamma_n = \partial\gamma_{n+1}$, we would have
that $W_{\phi^{(0)}}^{(\gamma_n)}$ is $\delta$-exact:

$$W_{\phi^{(0)}}^{(\gamma_n)} = \int_{\partial\gamma_{n+1}} \phi^{(n)} = \int_{\gamma_{n+1}} d\phi^{(n)}
  = \delta \int_{\gamma_{n+1}} \phi^{(n+1)}$$  [10]

Thus, given the operator $\phi^{(0)}$, we have constructed a
map between the homology of X and the equivariant
cohomology of $\delta$. There are as many maps as
basic operators $\phi^{(0)}$ one finds in the theory.

To actually construct these maps, we need to find
a solution of the descent equations [7]. As
announced before, there is a general solution to
those equations in Witten-type theories. Since in this
type of theories [2] holds, there exists an operator

$$G_\mu \equiv G_{0\mu}$$  [11]

that satisfies

$$P_\mu = T_{0\mu} = -i\,\delta G_\mu$$  [12]

Notice that $G_\mu$ is an anticommuting operator and a
1-form in spacetime. With the aid of this operator,
one constructs the following solution to the descent
equations [7]:

$$\phi^{(n)} = \frac{1}{n!}\, \phi^{(n)}_{\mu_1 \mu_2 \cdots \mu_n}\, dx^{\mu_1} \wedge \cdots \wedge dx^{\mu_n}$$  [13]

where

$$\phi^{(n)}_{\mu_1 \mu_2 \cdots \mu_n}(x) = G_{\mu_1} G_{\mu_2} \cdots G_{\mu_n} \phi^{(0)}(x),
  \qquad n = 1, \ldots, \dim X$$  [14]

One can easily check using [12] and the $\delta$-invariance
of $\phi^{(0)}$ that the operators [13] do satisfy the descent
equations [7].
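
As an illustration, the lowest case of this check can be sketched explicitly. The computation below assumes conventions that the text does not spell out: that the symmetry acts as a graded commutator with a fermionic charge Q, that the operator in [14] acts on the basic operator through a commutator, and that translations act as $\partial_\mu X = i[P_\mu, X]$. Under these assumptions, [12] becomes $\{Q, G_\mu\} = iP_\mu$, and the first descent equation follows from the graded Jacobi identity.

```latex
% Assumed conventions (not fixed by the text):
%   \delta X = [Q, X\}, \quad
%   \phi^{(1)}_\mu = [G_\mu, \phi^{(0)}], \quad
%   \partial_\mu X = i [P_\mu, X].
% With these, [12] reads P_\mu = -i \{Q, G_\mu\}, i.e. \{Q, G_\mu\} = i P_\mu.
\begin{aligned}
\delta \phi^{(1)}_\mu
  &= \{ Q, [G_\mu, \phi^{(0)}] \} \\
  &= [ \{Q, G_\mu\}, \phi^{(0)} ] - \{ G_\mu, [Q, \phi^{(0)}] \}
     && \text{(graded Jacobi identity)} \\
  &= [ i P_\mu, \phi^{(0)} ]
     && \text{(since } [Q, \phi^{(0)}] = 0 \text{)} \\
  &= \partial_\mu \phi^{(0)} ,
\end{aligned}
\qquad\Longrightarrow\qquad
\delta \phi^{(1)} = \delta\big( \phi^{(1)}_\mu \, dx^\mu \big) = d\phi^{(0)} .
```

The higher descendants follow the same pattern, with one more application of the same identity for each additional factor of the anticommuting 1-form operator.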

We have seen that Witten-type TQFTs are characterized
by property [2]. It would be desirable to have
at hand a systematic procedure to build theories 
satisfying that property. It has been found that 
extended supersymmetry provides a very helpful 
starting point to build those theories. Although super- 
symmetry guarantees from first principles only the 
weaker condition [12] instead of [2], all TQFTs that 
have been constructed from extended supersymmetry 
actually satisfy [2]. To build a TQFT from a theory 
with extended supersymmetry, one needs to go 
through the twisting procedure that we now describe. 


Twisting of Extended Supersymmetry 


All known Witten-type theories are related to an 
underlying extended supersymmetric quantum field 
theory. The topological theory is a modified version of 
the supersymmetric theory in which the Lorentz 
transformation properties (spins) of some of the fields 
have been modified. This modification of spin assign- 
ments is known as twisting, and it can be carried out 
on any theory with extended supersymmetry in any 
spacetime dimension. We will not consider the 
procedure in such a general setting but instead we 
will illustrate it by considering the case of N = 2
supersymmetry in four dimensions. We will begin with
a general description and then we will apply it to a
specific example: Donaldson-Witten theory.

Let us consider the Euclidean version of the N = 2
supersymmetry algebra with no central charges. Central
charges can be included without difficulty, but we will
not consider them for simplicity. The total symmetry
group of the theory is $H = SU(2)_+ \times SU(2)_- \times SU(2)_R \times U(1)_R$,
$K = SU(2)_+ \times SU(2)_-$ being the rotation group,
and $SU(2)_R \times U(1)_R$ the internal symmetry group of
the N = 2 supersymmetry algebra. The generator
algebra takes the following form:


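$$\begin{aligned}
\{Q_{\alpha v}, \bar{Q}_{\dot\beta w}\} &= 2\epsilon_{vw}\,\sigma^{\mu}_{\alpha\dot\beta} P_\mu, &
\{Q_{\alpha v}, Q_{\beta w}\} &= 0 \\
[P_\mu, Q_{\alpha v}] &= 0, &
[P_\mu, \bar{Q}_{\dot\alpha v}] &= 0 \\
[M_{\alpha\beta}, Q_{\gamma v}] &= \epsilon_{\gamma(\alpha} Q_{\beta) v}, &
[M_{\alpha\beta}, \bar{Q}_{\dot\gamma v}] &= 0 \\
[\bar{M}_{\dot\alpha\dot\beta}, Q_{\gamma v}] &= 0, &
[\bar{M}_{\dot\alpha\dot\beta}, \bar{Q}_{\dot\gamma v}] &= \epsilon_{\dot\gamma(\dot\alpha} \bar{Q}_{\dot\beta) v} \\
[B^{vw}, Q_{\alpha}^{u}] &= \epsilon^{u(v} Q_{\alpha}^{w)}, &
[B^{vw}, \bar{Q}_{\dot\alpha}^{u}] &= -\epsilon^{u(v} \bar{Q}_{\dot\alpha}^{w)} \\
[Q_{\alpha v}, R] &= Q_{\alpha v}, &
[\bar{Q}_{\dot\alpha v}, R] &= -\bar{Q}_{\dot\alpha v}
\end{aligned}$$  [15]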


In these relations $v, w \in \{1, 2\}$ are $SU(2)_R$ indices, and
$\alpha$ and $\dot\alpha$ denote spinorial indices of $SU(2)_-$ and
$SU(2)_+$, respectively. The supersymmetry generators
$Q_{\alpha v}$ and $\bar{Q}_{\dot\alpha v}$ transform under H as $(0,2,2)^{1}$ and
$(2,0,2)^{-1}$, respectively. $M_{\alpha\beta}$ and $\bar{M}_{\dot\alpha\dot\beta}$ are the
generators of $SU(2)_-$ and $SU(2)_+$, respectively, while
$B^{vw}$ and R generate $SU(2)_R$ and $U(1)_R$, respectively.
The twisting of a supersymmetric theory involves a
modification of the couplings of the theory to a
background metric on the space where the theory is
defined. This modification is carried out by redefining
the Lorentz transformation properties of the different
fields, making use of the internal symmetry $SU(2)_R$. In
particular, we will redefine the couplings of the fields
to the $SU(2)_+$ spin connection according to the way
they transform under $SU(2)_R$. This is easily done by
identifying the $SU(2)_R$ indices v with the $SU(2)_+$
indices $\dot\alpha$. The procedure involves a redefinition of
the rotation group into $K' = SU'(2)_+ \otimes SU(2)_-$, where
$SU'(2)_+$ is generated by

$$\bar{M}'_{\dot\alpha\dot\beta} = \bar{M}_{\dot\alpha\dot\beta} - B_{\dot\alpha\dot\beta}$$  [16]

The supersymmetry generators $Q_{\alpha v}$ and $\bar{Q}_{\dot\alpha v}$ get
transformed in the following way:

$$Q_{\alpha v} \to Q_{\alpha\dot\beta}, \qquad \bar{Q}_{\dot\alpha v} \to \bar{Q}_{\dot\alpha\dot\beta}$$  [17]

which allows us to define the "topological
supercharge":

$$Q = \epsilon^{\dot\alpha\dot\beta}\, \bar{Q}_{\dot\alpha\dot\beta}$$  [18]


It is simple to prove using [15] and [16] that this
quantity is a scalar under the new rotation group
K': $[M_{\alpha\beta}, Q] = 0$ and $[\bar{M}'_{\dot\alpha\dot\beta}, Q] = 0$. In addition, from
[15], it follows that Q is nilpotent (in the absence of
central charges):

$$Q^2 = 0$$  [19]
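
The nilpotency statement can be made explicit in one line. The sketch below assumes that the topological supercharge [18] is the contraction $Q = \epsilon^{\dot\alpha\dot\beta}\bar{Q}_{\dot\alpha\dot\beta}$ of the twisted dotted generators, and that, with no central charges, the dotted generators anticommute among themselves (the conjugate of the vanishing undotted anticommutator in [15]). Both are assumptions about notation, not statements taken verbatim from the text.

```latex
% Assumptions: Q = \epsilon^{\dot\alpha\dot\beta} \bar Q_{\dot\alpha\dot\beta}, and
% \{ \bar Q_{\dot\alpha v}, \bar Q_{\dot\beta w} \} = 0 when central charges are absent.
Q^2 = \tfrac{1}{2} \{ Q, Q \}
    = \tfrac{1}{2}\, \epsilon^{\dot\alpha\dot\beta} \epsilon^{\dot\gamma\dot\delta}
      \{ \bar Q_{\dot\alpha\dot\beta}, \bar Q_{\dot\gamma\dot\delta} \}
    = 0 .
```

If central charges were present, the anticommutator on the right would no longer vanish, which is why the text restricts the statement to their absence.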


The scalar generator Q leads to the topological
symmetry $\delta$ of the previous section. Actually, the
twisting procedure also provides the operator $G_\mu$ in
[12]. Defining

$$G_\mu = \frac{i}{4}\, (\bar\sigma_\mu)^{\dot\alpha\gamma}\, Q_{\gamma\dot\alpha}$$  [20]

one easily finds, after using [15] and [18],

$$\{Q, G_\mu\} = \partial_\mu$$  [21]


which is indeed equivalent to [12]. On general
grounds we cannot prove that twisted supersymmetric
theories lead to theories which satisfy [2].
However, relation [12], which is weaker, is guaranteed.
It turns out that in all the models originating
from extended supersymmetry which have been
studied, [2] is satisfied, and thus the resulting
theories are TQFTs of Witten type.


Donaldson-Witten Theory


One of the greatest successes of TQFT has been the 
discovery of Seiberg-Witten invariants as building 
blocks of Donaldson invariants. This was achieved 
in two main steps. First, Donaldson theory was 
reformulated in field-theoretical terms, using pertur- 
bative methods. Second, the resulting TQFT was 
solved using nonperturbative methods. In this sec- 
tion we are going to describe in some detail the first 
step. The second one will be briefly addressed later 
and is the main object of a separate article in the 
encyclopedia (see Seiberg-Witten Theory).

Let us consider N = 2 supersymmetric Yang-Mills
theory in four dimensions. The field content of the
theory is the following: a gauge field $A_\mu$, two spinors
$\lambda_{v\alpha}$, and a complex scalar $\phi$, all of them in the
adjoint representation of a gauge group G. In
addition, the theory possesses the auxiliary fields
$D_{vw}$ in the 3 of the internal $SU(2)_R$. The theory has
the following action:


— 1 
J d'x tr (vovv m IA o" VA — 4 PEU 
1 1 " ; 
—DuaD NER. : ATA E m ggg A ya " 
i ——H —WwWÓ 
= EvwA (à A " 2 


This action is invariant under the following N = 2
supersymmetric transformations:


66 — vV2e""£,A,, 
6A, =i, A —iAyo, E. 
Sdra Dy Eus — pale, dT] — io" dP Eye Fu 
+iv2 ZEn we V iP 
=2i6"G"V 


[23] 


yr” HRS, 
+ 2iv2e v^, ol] + 2i 22" [X^ gj 


óD Ut 


€, being spinorial N = 2 supersymmetric parameters. 

We can now twist the above theory following the 
procedure explained in the previous section. Upon 
twisting, the fields of the theory change their spin 
content as follows: 


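$$\begin{aligned}
A_\mu\,(2,2,0)^{0} &\to A_\mu\,(2,2)^{0} \\
\lambda_{\alpha v}\,(2,0,2)^{1} &\to \psi_\mu\,(2,2)^{1} \\
\bar\lambda_{\dot\alpha v}\,(0,2,2)^{-1} &\to \eta\,(0,0)^{-1},\ \chi_{\mu\nu}\,(3,0)^{-1} \\
\phi\,(0,0,0)^{2} &\to \phi\,(0,0)^{2} \\
\phi^\dagger\,(0,0,0)^{-2} &\to \phi^\dagger\,(0,0)^{-2} \\
D_{vw}\,(0,0,3)^{0} &\to D_{\mu\nu}\,(3,0)^{0}
\end{aligned}$$  [24]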


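In this table the representations of the respective
rotation groups carried by the fields have been
indicated. The superindices refer to the $U(1)_R$ charge,
which is also called "ghost number" in the context
of TQFT. The fields $\eta$ and $\chi$ are given by the
antisymmetric and symmetric pieces of $\bar\lambda_{\dot\alpha\dot\beta}$:
$\chi_{\dot\alpha\dot\beta} = \bar\lambda_{(\dot\alpha\dot\beta)}$ and $\eta = (1/2)\,\epsilon^{\dot\alpha\dot\beta}\bar\lambda_{\dot\alpha\dot\beta}$.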

Notice that the twisted fields in [24] are differ- 
ential forms on X; therefore, the twisted theory 
makes sense globally on any arbitrary Riemannian 
4-manifold. This is not the case with the original 
N = 2 supersymmetric Yang-Mills, which contains
fermionic fields. Making global sense of those on 
arbitrary Riemannian 4-manifolds requires the 
manifold to be Spin. 

The dynamics of the twisted theory is governed by 
an action which can be obtained by twisting the 
action [22]. On an arbitrary Riemannian 4-manifold
endowed with a metric $g_{\mu\nu}$, the twisted action
becomes


f= J dix /gtr (vv ih seh tV 


= m = iT Jj ¿DD 
; lé el - mv x“ 5, Xag 
+iv2n[0, | — E Vad er, " e) [25] 


where $\sqrt{g} = (\det(g_{\mu\nu}))^{1/2}$.

To obtain the transformations of the fields under
the topological symmetry, we need to compute the
Q-transformations. These are easily obtained using
[18] and [23]. They turn out to be

$$\begin{aligned}
[Q, \phi] &= 0 \\
[Q, A_\mu] &= \psi_\mu \\
\{Q, \eta\} &= i[\phi, \phi^\dagger] \\
\{Q, \psi_\mu\} &= 2\sqrt{2}\,\nabla_\mu \phi \\
[Q, \phi^\dagger] &= 2\sqrt{2}\, i\eta \\
\{Q, \chi_{\dot\alpha\dot\beta}\} &= i(F^{+}_{\dot\alpha\dot\beta} - D_{\dot\alpha\dot\beta}) \\
[Q, D] &= (2\nabla\psi)^{+} + 2\sqrt{2}\,[\phi, \chi]
\end{aligned}$$  [26]

where $\psi_\mu = \sigma_{\mu\alpha\dot\beta}\psi^{\alpha\dot\beta}$ and $F^{+}_{\dot\alpha\dot\beta} = \bar\sigma^{\mu\nu}_{\dot\alpha\dot\beta} F_{\mu\nu}$ is the self-dual
part of $F_{\mu\nu}$. Using these transformations, one easily
finds that $Q^2$ is a gauge transformation. This is not
unexpected, since the N = 2 supersymmetric transformations
[23] are in the Wess-Zumino gauge and
close only up to gauge transformations. This
property implies that one must consider the equivariant
cohomology of Q defined on the set of
gauge-invariant operators.


The action [25] is Q-exact up to a topological
term:

$$S = \{Q, V\} - \frac{1}{2} \int F \wedge F$$  [27]


where 
V = / dx /g tr E x, (F^ de D?) 


t i EN" waa A41 
72 nid, Ó | tr 25 ae Q ) [28] 


Actually, it turns out that in all the theories obtained
after twisting extended supersymmetry, the resulting
actions are Q-exact up to topological terms. In the
case of N = 2 theories, topological (theta) terms
$\int F \wedge F$ are generically not observable (due to a chiral
anomaly), so it is customary to pick

$$S_{DW} = \{Q, V\}$$  [29]

as the action of the theory, which immediately implies
[2] and therefore the topological character of the
theory. Notice, however, that [29] is stronger than [2].

As we described in the previous section, the 
observables of the theory can be constructed using 
the operator $G_\mu$ in [20]. Its action on the twisted
fields is easily obtained using [23]:


1 
PN / = —— Y, 
[ I o) 2/2"! 


i ; 
[G_, A] = 2 Sm" — 1X pw 
iv2 


[G, 1] = --7 Vó 
{Gu V} = (Epa + Di) [30] 
IG, 6] — 0 


[G, F*] =iVx + > * Vn 
3iv2 


[G, D] = + Unt Vx 


We now need to fix the basic operator $\phi^{(0)}$ in [14].
The starting point must be a set of gauge-invariant,
Q-closed operators which are not Q-trivial. Since
$[Q, \phi] = 0$, these operators are the gauge-invariant
polynomials in the field $\phi$. For a simple gauge group
of rank r the algebra of these polynomials is
generated by r elements, and we shall denote this
basis by $O_n$, n = 1, ..., r. A simple choice for SU(N)
consists of the following Casimirs:

$$O_n = \mathrm{tr}(\phi^n), \qquad n = 2, \ldots, N$$  [31]

Using $G_\mu$ we can now construct the map between
the homology of X and the equivariant cohomology
of Q. Let us consider the simple case SU(2). There
exists only one independent Casimir and, correspondingly,
only one basic operator:

$$O = \mathrm{tr}(\phi^2)$$  [32]

for which one finds the following set of descendants:


QU eig e v, ) dx" 


1 1 
O?) = tf (= OF, d Dv) 
2 i2 e [33] 


E dx" ^ dx" 


The map from the homology of X to the equivariant
cohomology of Q can now be constructed very
easily. Let $\gamma_i$ be an element of the homology group
$H_i(X)$. We associate to it the following observable:

$$I_i(\gamma_i) = \int_{\gamma_i} O^{(i)}$$  [34]

where $O^{(i)}$ is given in [33]. The construction ensures
that $I_i(\gamma_i)$ is invariant under Q and gauge
transformations. Furthermore, it also ensures that $I_i(\gamma_i)$ is
not Q-exact.

Let us consider the computation of correlation
functions. The discussion will be presented for a
generic gauge group. We will consider the topological
theory defined by the Donaldson-Witten action

$$S_{DW} = \{Q, V\}$$  [35]

where V is defined in [28]. The property [35] has a
very important consequence. The action $S_{DW}$ shows
up in the correlation functions as $\exp(-S_{DW}/e^2)$,
where e is a free parameter which corresponds to
the coupling constant of the N = 2 theory. Since the
term involving the coupling constant is Q-exact, the
correlation functions of Q-invariant operators are
independent of e. Let us explain this in some detail.
The (unnormalized) correlation functions of the
theory are defined by

$$\langle \phi_1 \cdots \phi_n \rangle = \int \phi_1 \cdots \phi_n\, e^{-(1/e^2) S_{DW}}$$  [36]

where $\phi_1, \ldots, \phi_n$ are invariant under Q transformations.
Using the fact that $S_{DW}$ is Q-exact, one obtains

$$\frac{\partial}{\partial e} \langle \phi_1 \cdots \phi_n \rangle
  = \frac{2}{e^3} \langle \phi_1 \cdots \phi_n S_{DW} \rangle
  = \frac{2}{e^3} \langle \{Q, \phi_1 \cdots \phi_n V\} \rangle = 0$$  [37]

where we have used the fact that Q is a symmetry of
the theory, and therefore as in [3] the last functional
integral gives zero. This result implies that one can
compute these correlation functions in different
limits of e. In the weak-coupling limit (semiclassical
or saddle-point approximation), one establishes the
connection with Donaldson theory. In the strong-coupling
limit, Seiberg-Witten invariants appear and
one finds the connection between these two types of
invariants. We will briefly explore the weak-coupling
limit $e \to 0$. The functional integral [36]
can be evaluated exactly in two steps: first, one
analyzes the zero modes or classical configurations
that minimize the action; then one expands around
them, considering only quadratic fluctuations. The
integration over these quadratic fluctuations
involves ratios of determinants of kinetic operators
that, because of the Q-symmetry of the theory (which
in fact is a Bose-Fermi symmetry), are $\pm 1$. One is
then left with an integral over the bosonic zero
modes, which leads to a finite-dimensional integral
over the space of bosonic collective coordinates, and
a finite Grassmannian integral over the zero modes
of the fermionic fields. A careful analysis of the zero
modes, first carried out by Witten, reveals that the
infinite-dimensional functional integral is replaced
by a finite-dimensional integral over the moduli
space of anti-self-dual (ASD) connections $\mathcal{M}_{ASD}$,
that is, the space of connections satisfying $F^{+}_{\mu\nu} = 0$.

Therefore, the correlation functions [36] have the
form

$$\langle \phi_1 \cdots \phi_n \rangle = \int_{\mathcal{M}_{ASD}} \hat\phi_1 \wedge \cdots \wedge \hat\phi_n$$  [38]

where the fields $\phi_1, \ldots, \phi_n$ are mapped to differential
forms $\hat\phi_1, \ldots, \hat\phi_n$ on $\mathcal{M}_{ASD}$, the degree of each
form being given by the ghost number of its
partner. Notice that the integral on the right-hand
side vanishes unless the form has top degree. From
the field-theoretical point of view, this is the
requirement that the overall ghost number of the
correlation function must be equal to $\dim \mathcal{M}_{ASD}$.
The quantities on the right-hand side of [38] are,
for gauge group SU(2), precisely the Donaldson
invariants. Thus, Witten's work provided a new
point of view on these invariants by reformulating 
them in a quantum field theory language. This is a 
very important contribution since quantum field 
theory is a very rich framework and a wide variety 
of methods can be used to analyze the correlation 
functions. This opened an entirely new strategy to 
investigate the Donaldson invariants. The emergence 
of Seiberg-Witten invariants is perhaps the greatest 
achievement of the implementation of this strategy. 


We finish this section by pointing out that many 
features of the evaluation of the functional integral 
of Donaldson-Witten theory developed here are
common to most topological field theories of the 
Witten type. These features can be studied in the 
context of the Mathai-Quillen formalism which is 
the object of a separate article in the encyclopedia 
(see Mathai-Quillen Formalism). 


Chern-Simons Gauge Theory for Knots and Links


Chern-Simons gauge theory is the most important 
example of Schwarz-type TQFTs. Let us begin by 
introducing its basic elements. Chern-Simons gauge 
theory is a quantum field theory whose action is 
based on the Chern-Simons form associated to a 
nonabelian gauge group. The theory is defined by 
the following data: a smooth 3-manifold M which 
will be taken to be compact, a gauge group G which 
will be taken semisimple and compact, and an 
integer parameter k. The action of the theory is

$$S_{CS}(A) = \frac{k}{4\pi} \int_M \mathrm{Tr}\left( A \wedge dA + \frac{2}{3}\, A \wedge A \wedge A \right)$$  [39]

where A is a gauge connection and the trace is taken
in the fundamental representation. The exponential
of i times this action is invariant under gauge
transformations,

$$A \to g^{-1} A g + g^{-1} dg$$  [40]

where g is a map $g: M \to G$.

Notice that the action [39] is independent of the 
metric on the 3-manifold M. In this theory, appro- 
priate observables lead to correlation functions 
which correspond to topological invariants. Candi- 
dates to be observables of this type must be metric 
independent and gauge invariant. Wilson loops 
satisfy these properties. They correspond to the 
holonomy of the gauge connection A along a loop. 
Given a representation R of the gauge group G and
a 1-cycle $\gamma$ on M, the Wilson loop is defined as

$$W_\gamma^R(A) = \mathrm{Tr}_R(\mathrm{Hol}_\gamma(A)) = \mathrm{Tr}_R\, P \exp \oint_\gamma A$$  [41]

Products of these operators are the natural candidates
to obtain topological invariants after computing
their correlation functions. These correlation
functions are formally written as

$$\langle W_{\gamma_1}^{R_1} W_{\gamma_2}^{R_2} \cdots W_{\gamma_n}^{R_n} \rangle
  = \int [DA]\, W_{\gamma_1}^{R_1}(A)\, W_{\gamma_2}^{R_2}(A) \cdots W_{\gamma_n}^{R_n}(A)\, e^{i S_{CS}(A)}$$  [42]

where $\gamma_1, \gamma_2, \ldots, \gamma_n$ are 1-cycles on M and $R_1, R_2, \ldots, R_n$
are representations of G. In [42], the
quantity [DA] denotes the functional integral measure,
and it is assumed that an integration over
connections modulo gauge transformations is carried
out. As usual in quantum field theory, this
integration is not well defined. Field theorists have
developed methods to assign a meaning to the
right-hand side of [42]. These methods mainly fall into
two categories, perturbative and nonperturbative,
and their degree of success mostly depends on the
quantum field theory under consideration. For gauge 
theories, it is also possible to take an alternative 
approach, the large-N expansion, which in general 
provides further insights into the theory. In Chern- 
Simons gauge theory all these three methods have 
proved of great value. 

Witten (1989) showed, using nonperturbative 
methods, that when one considers nonintersecting 
cycles $\gamma_1, \gamma_2, \ldots, \gamma_n$ without self-intersections, the
correlation functions [42] lead to the polynomial 
invariants of knot theory discovered a few years 
earlier starting with the work of Jones (1985). 

Knot theory studies embeddings $\gamma: S^1 \to M$. Any
two such embeddings are considered equivalent if
the image of one of them can be deformed into the
image of the other by a homeomorphism of M. The
main goal of knot theory is to classify the resulting
equivalence classes. Each of these classes is a knot.
Most of the work on knot theory has been carried
out for the simple case $M = S^3$. Chern-Simons gauge
theory, however, being a formulation intrinsically 
three dimensional, provides a framework to study the 
case of more general 3-manifolds M. 

A powerful approach to classify knots is based on 
the construction of knot invariants. These are 
quantities which can be computed for a representa- 
tive of a class and are invariant within the class, that 
is, under continuous deformations of the chosen 
representative. At present, it is not known if there 
exist enough knot invariants to classify knots. 
Vassiliev invariants (Vassiliev 1990) are the most 
promising candidates, but it is already known that if 
they do provide such a classification, infinitely many 
of them are needed. 

The problem of the classification of knots in $S^3$
can be reformulated in a two-dimensional framework
using regular knot projections. Given a
representative of a knot in $S^3$, deform it continuously
in such a way that the projection on a plane has
simple crossings. Draw the projection on the plane,
and at each crossing use the convention that the line
that goes under the crossing is erased in a neighborhood
of the crossing. The resulting diagram is a set
of segments on the plane, containing the relevant
information at the crossings. The problem of
classifying knots is equivalent to the problem
of classifying knot projections modulo a series of
relations among them. These relations are known as
Reidemeister moves. Invariance of a quantity under
the three Reidemeister moves is called invariance
under ambient isotopy. If a quantity is invariant
under all but the first move, it is said to possess
invariance under regular isotopy.

The formalism described for knots generalizes to 
the case of links. For a link of n components, one
considers n embeddings, $\gamma_i: S^1 \to M$ (i = 1, ..., n),
with no intersections among them. Again, the main
problem that link theory faces is
their classification modulo homeomorphisms of M.
In this case one can also define regular projections 
and reformulate the problem in terms of their 
classification modulo the Reidemeister moves. 

The study of knot and link invariants saw
important progress in the 1980s. Jones (1985)
discovered a new invariant which carries his name.
The Jones polynomial can be defined very simply in 
terms of skein relations. These are a set of rules that 
can be applied to the diagram of a regular knot 
projection to construct the polynomial invariant. 
They establish a relation between the invariants 
associated to three links which only differ in a 
region as shown in Figure 1 where arrows have been 
introduced to take into account that the Jones 
polynomial is defined for oriented links. 

Figure 1 Skein relations.

If one denotes by $V_L(t)$ the Jones polynomial
corresponding to a link L, t being the argument of
the polynomial, it must satisfy the skein relation:

$$\frac{1}{t} V_{L_+} - t V_{L_-} = \left( \sqrt{t} - \frac{1}{\sqrt{t}} \right) V_{L_0}$$  [43]

where $L_+$, $L_-$, and $L_0$ are the links shown in
Figure 1. This relation plus a choice of normalization
for the unknot (U) are enough to compute the
Jones polynomial for any link. The standard choice
for the unknot is

$$V_U = 1$$  [44]

though it is not the most natural one from the point
of view of Chern-Simons gauge theory. After Jones'
work in 1984, many other polynomial invariants
were discovered, such as the HOMFLY and the
Kauffman polynomial invariants.
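
To make the recursive use of the skein relation [43] concrete, the following minimal sketch computes the Jones polynomial of the Hopf link and the trefoil from the unknot normalization [44]. It is an illustration under stated assumptions, not a general algorithm: Laurent polynomials in $\sqrt{t}$ are stored as plain dictionaries, the helper names (`add`, `mul`, `v_plus`) are hypothetical, and the crossing being resolved is taken to be a positive one, so with the opposite convention one obtains the mirror images ($t \to 1/t$).

```python
from collections import defaultdict

# A Laurent polynomial in s = t**(1/2) is stored as {power_of_s: integer_coefficient}.

def add(p, q):
    """Sum of two Laurent polynomials, dropping zero coefficients."""
    out = defaultdict(int)
    for poly in (p, q):
        for power, coeff in poly.items():
            out[power] += coeff
    return {power: coeff for power, coeff in out.items() if coeff != 0}

def mul(p, q):
    """Product of two Laurent polynomials, dropping zero coefficients."""
    out = defaultdict(int)
    for p1, c1 in p.items():
        for p2, c2 in q.items():
            out[p1 + p2] += c1 * c2
    return {power: coeff for power, coeff in out.items() if coeff != 0}

T = {2: 1}                  # t = s**2
Z = {1: 1, -1: -1}          # sqrt(t) - 1/sqrt(t)
UNKNOT = {0: 1}             # normalization [44]
UNLINK2 = {1: -1, -1: -1}   # -(sqrt(t) + 1/sqrt(t)), two unlinked circles

def v_plus(v_minus, v_zero):
    """Solve [43] for V_{L+}: V_{L+} = t^2 V_{L-} + t (sqrt(t) - 1/sqrt(t)) V_{L0}."""
    return add(mul(mul(T, T), v_minus), mul(T, mul(Z, v_zero)))

# Check UNLINK2 against [43] with L+ = L- = unknot and L0 = two unlinked circles:
# (sqrt(t) - 1/sqrt(t)) * V_unlink must equal 1/t - t.
assert mul(Z, UNLINK2) == add({-2: 1}, {2: -1})

# Hopf link: switching one crossing gives the unlink, smoothing it gives the unknot.
hopf = v_plus(UNLINK2, UNKNOT)
# Trefoil: switching one crossing gives the unknot, smoothing it gives the Hopf link.
trefoil = v_plus(UNKNOT, hopf)

print(hopf)     # powers of sqrt(t): -t^(5/2) - t^(1/2)
print(trefoil)  # powers of sqrt(t): t + t^3 - t^4
```

Each step only uses [43] once, which reflects how a crossing change trades a link for two simpler ones until only unknots and unlinks remain.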

The pioneering work of Witten in 1988 showed
that the correlation functions of products of Wilson
loops [42] correspond to the Jones polynomial when
one considers SU(2) as gauge group and all the
Wilson loops entering in the correlation function are
taken in the fundamental representation F. For
example, if one considers a knot K, Witten showed
that

$$V_K(t) = \langle W_K^F \rangle$$  [45]

provided that one performs the identification

$$t = \exp\left( \frac{2\pi i}{k + h} \right)$$  [46]

where h = 2 is the dual Coxeter number of the gauge
group SU(2). Witten also showed that if instead of 
SU(2) one considers SU(N) and the Wilson loop 
carries the fundamental representation, the resulting 
invariant is the HOMFLY polynomial. The second 
variable of this polynomial originates in this context 
from the N dependence. However, these cases are 
just a sample of the general framework intrinsic to 
Chern-Simons gauge theory. Taking other groups
and other representations, one obtains an enormous
set of knot and link invariants. These invariants
can also be obtained in the context of quantum 
groups. 
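
As a small numerical illustration of the identification between the polynomial variable and the Chern-Simons level, the sketch below evaluates $t = \exp(2\pi i/(k+h))$ with h = 2 for SU(2) and inserts it into the Jones polynomial of one chirality of the trefoil, $t + t^3 - t^4$ (a standard tabulated value, used here as an assumption rather than derived). The function names are hypothetical, and normalization factors relating the correlator to the polynomial are not modeled.

```python
import cmath

def jones_variable(k: int, h: int = 2) -> complex:
    """Root of unity t = exp(2*pi*i/(k+h)); h = 2 is the dual Coxeter number of SU(2)."""
    return cmath.exp(2j * cmath.pi / (k + h))

def trefoil_jones(t: complex) -> complex:
    """Jones polynomial of one chirality of the trefoil, t + t^3 - t^4 (tabulated value)."""
    return t + t**3 - t**4

# For every integer level k the variable t lies on the unit circle, and it
# approaches 1 as k grows, where the Jones polynomial of any knot equals 1.
for k in (1, 2, 10, 1000):
    t = jones_variable(k)
    print(k, abs(t), trefoil_jones(t))
```

The large-k behavior shown by the loop corresponds to the weak-coupling regime, in which the polynomial variable is close to 1.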

Many nonperturbative studies of Chern-Simons 
gauge theory have been carried out. The quantiza- 
tion of the theory has been studied from the point of 
view of the operator formalism as well as other 
more geometrical methods. Also, its connection to 
two-dimensional conformal field theory has been 
further elucidated, and a powerful method for the 
general computation of knot and link invariants has 
been developed by Kaul and collaborators. 

Chern-Simons theory is also amenable to pertur- 
bative analysis, which has provided important 
representations of the Vassiliev invariants. These 
invariants, proposed by Vassiliev in 1990, turned 
out to be the coefficients of the perturbative series 
expansion of the correlators of Chern-Simons gauge 
theory. Perturbative studies can be carried out in
different gauges, giving rise to a variety of new
representations of Vassiliev invariants. Among the
most relevant results related to these topics are the 
integral expressions for Vassiliev invariants by 
Kontsevich and by Bott and Taubes, as well as the 
recent combinatorial ones. These developments are 
not described here but the interested reader is 
referred to the recent review (Labastida 1999). 


Advanced Developments 


Topological sigma models are another important 
type of (Witten-type) TQFTs. These theories are 
obtained after twisting 2D N = 2 supersymmetric
sigma models. The twisting can be done in two
different ways, leading to two types of models, A and
B. Their existence is related to mirror symmetry.
Only type-A models will be described in what
follows. These models can be defined on an
arbitrary almost-complex manifold, though typically
they are considered on Kähler manifolds. The theory
involves maps from two-dimensional Riemann surfaces
$\Sigma$ to target spaces X, together with fermionic
degrees of freedom on $\Sigma$ which are mapped to
tangent vectors on X. The functional integral of the
resulting theory is localized on holomorphic maps, 
defining the corresponding moduli space. The 
corresponding Q-cohomology provides the set of 
physical observables, which can be mapped to 
cohomology classes on the moduli space and 
integrated to produce topological invariants. 

Topological sigma models keep the complex
structure of the Riemann surface $\Sigma$ fixed. Motivated
by string theory, one also considers the
situation in which one integrates over complex 
structures. In this case, one ends up working with 
holomorphic maps in the entire moduli space of 
curves. The resulting theories are called topologi- 
cal strings. 

We will review now a particular example of 
topological string theory which, besides being very 
interesting from the point of view of physics and 
mathematics, will be very useful in establishing a 
relation with Chern-Simons gauge theory. Let us 
consider topological strings with target manifold X 
a Calabi-Yau 3-fold. In this case, the virtual 
dimension of the moduli space of holomorphic 
maps turns out to be zero. Two situations can 
occur: either the space is given by a number of 
points (the real dimension is zero) or the moduli 
space is finite dimensional and possesses a bundle of 
the same dimension as the tangent bundle. In the 
first case, topological strings count the number of 
points weighted by the exponential of the area of the 
holomorphic map (the pullback of the Kähler form
integrated over the surface) times $x^{2g-2}$, where x is
the string-coupling constant and g is the genus of $\Sigma$.
In the second case, one computes the top Chern class
of the appropriate bundles (properly defined), again
weighted by the same factor. In both cases one can
classify the contributions according to the homology
class $\beta$ on X in which the image of the holomorphic
map is contained. The sums of the numbers obtained
for each $\beta$ and fixed g are known as Gromov-Witten
invariants, $N_g^\beta$. The topological string contribution
takes the form

$$\sum_{g\ge 0} x^{2g-2} \sum_{\beta\in H_2(X,\mathbb{Z})} N_g^\beta\, e^{\int_\beta\omega} \qquad [47]$$

where $\omega$ is the Kähler class of the Calabi-Yau manifold.
In general, the quantities $N_g^\beta$ are rational numbers.

The preceding discussion has shown how Gromov-
Witten invariants can be interpreted in terms of string 
theory. One could think that this is just a fancy 
observation and that no further insight on these 
invariants can be gained from this formulation. The 
situation turns out to be quite the opposite. Once a string 
formulation has been obtained, the whole machinery of 
string theory is at our disposal. One should look for new
ways to compute the quantity [47], where Gromov-
Witten invariants are packed. The hope is that, if this is
possible, the new emerging picture will provide new
insights on these invariants. This is indeed what
occurred recently. It turns out that the quantity [47]
can be obtained from an alternative point of view in
which the embedded Riemann surfaces are regarded as
D-branes. The outcome of this approach is that the
Gromov-Witten invariants can be written in terms of
other invariants which are integers and that possess a
geometrical interpretation. To be more specific, the
quantity [47] takes the form

$$\sum_{g\ge 0}\ \sum_{\beta\in H_2(X,\mathbb{Z})}\ \sum_{d>0} n_g^\beta\, \frac{1}{d}\left(2\sin\frac{dx}{2}\right)^{2g-2} e^{d\int_\beta\omega} \qquad [48]$$

where $n_g^\beta$ are the new "integer" invariants. This
prediction has been verified in all the cases in which
it has been tested. A similar structure will be found
in the next section in the context of knot theory in
the large-N limit.
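
The relation between the rational and the integer invariants can be made concrete at genus zero. The following is only a sketch, assuming the forms of [47] and [48] as quoted above; the notation $d\,|\,\beta$ (meaning that $\beta/d$ is again an integral class) is introduced here for illustration and is not taken from the original text.

```latex
% Genus-zero part of [48] at small x, compared with the x^{-2} term of [47].
% Uses (2\sin(dx/2))^{-2} = (dx)^{-2}\,(1 + O(x^2)).
\begin{align*}
\sum_{\beta'}\sum_{d>0} n_0^{\beta'}\,\frac{1}{d}
  \left(2\sin\frac{dx}{2}\right)^{-2} e^{d\int_{\beta'}\omega}
 &= x^{-2}\sum_{\beta'}\sum_{d>0}\frac{n_0^{\beta'}}{d^{3}}\,
    e^{\int_{d\beta'}\omega} + O(x^{0}), \\
\text{so that}\qquad
N_0^{\beta} &= \sum_{d\,|\,\beta}\frac{1}{d^{3}}\; n_0^{\beta/d}.
\end{align*}
```

The factor $1/d^3$ is the familiar multiple-cover contribution; it illustrates why the $N_g^\beta$ are in general rational even when the $n_g^\beta$ are integers.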

Let us now consider also Donaldson-Witten theory
from a new perspective. To be more specific, let us
consider the case in which the gauge group is SU(2),
and the 4-manifold X is simply connected and has
$b_2^+ > 1$ (the case $b_2^+ = 1$ is anomalous). In this situation
there are $1 + b_2$ physical observables [34], $\mathcal{O}$ and
$I(\Sigma_a)$ ($a = 1,\ldots,b_2$), where $\Sigma_a$ is a basis of
$H_2(X)$. These can be packed in a generating functional:

$$\left\langle \exp\left(\sum_a \alpha_a I(\Sigma_a) + \lambda\,\mathcal{O}\right)\right\rangle \qquad [49]$$

where $\lambda$ and $\alpha_a$ ($a = 1,\ldots,b_2$) are parameters. In
computing this quantity one can argue that the
contribution is localized on the moduli space of
instanton configurations and one ends up, after
taking into account the selection rule dictated by the
dimensionality of the moduli space, with integrations
over the moduli space of the selected forms. The
resulting quantities are Donaldson invariants.

As in the case of topological sigma models one could 
be tempted to argue that the observation leading to a 
field-theoretical interpretation of Donaldson invariants
does not provide any new insight. Quite the
contrary, once a field theory formulation is available,
one has at one's disposal a huge machinery which could
lead, on the one hand, to further generalizations of the 
theory and, on the other hand, to new ways to 
compute quantities such as [49], obtaining new 
insights on these invariants. This is indeed what 
happened in the 1990s, leading to an important 
breakthrough in 1994 when Seiberg and Witten 
calculated [49] in a different way and pointed out the 
relation of Donaldson invariants to new integer 
invariants that nowadays bear their names. 

The localization argument that led to the interpreta- 
tion of [49] as Donaldson invariants is valid because 
the theory under consideration is exact in the weak- 
coupling limit. Actually, the topological theory under 
consideration is independent of the coupling constant 
and thus calculations in the strong-coupling limit are 
also exact. These types of calculations were out of 
reach before 1994. The situation changed dramatically 
after the work of Seiberg and Witten in which $\mathcal{N}=2$
super Yang-Mills theory was solved in the strong-
coupling limit. Its application to the corresponding 
twisted version was immediate and it turned out that 
Donaldson invariants can be written in terms of new 
integer invariants now known as Seiberg-Witten
invariants (Witten 1994). The development bears a
strong resemblance to the one described above for
topological strings: certain noninteger invariants can 
be expressed in terms of new integer invariants. 

The Seiberg-Witten invariants are actually simpler 
to compute than Donaldson invariants. They corre- 
spond to partition functions of topological 
Yang-Mills theories where the gauge group is 
abelian. These contributions can be grouped into
classes labeled by $x = -2c_1(L)$, where $c_1(L)$ is the
first Chern class of the corresponding line bundle.
The sum of contributions, each being $\pm 1$, for a given
class x is the integer Seiberg-Witten invariant $n_x$. The
strong-coupling analysis of topological Yang-Mills
theory leads to the following expression for [49]:

$$2^{1+\frac{1}{4}(7\chi+11\sigma)}\left( e^{\frac{v^2}{2}+2\lambda}\sum_x n_x\, e^{v\cdot x} + i^{\frac{\chi+\sigma}{4}}\, e^{-\frac{v^2}{2}-2\lambda}\sum_x n_x\, e^{-i v\cdot x}\right) \qquad [50]$$

where $v = \sum_a \alpha_a\Sigma_a$, and $\chi$ and $\sigma$ are the Euler
number and the signature of the manifold X. This
result matches the known structure of [49] (structure
theorem of Kronheimer and Mrowka) and provides
a meaning to its unknown quantities in terms of the
new Seiberg-Witten invariants. Equation [50] is a
rather remarkable prediction that has been tested in
many cases, and for which a general proof has been
recently proposed. For a review of the subject, see
Labastida and Lozano (1998).

The situation for manifolds with $b_2^+ = 1$ involves a
metric dependence and has been worked out in 
detail (Moore and Witten 1998). The formulation of 
Donaldson invariants in field-theoretical terms has 
also provided a generalization of these invariants. 
This generalization has been carried out in several 
directions: (1) the consideration of higher-rank 
groups, (2) the coupling to matter fields after 
twisting $\mathcal{N}=2$ hypermultiplets, and (3) the twist of
theories involving $\mathcal{N}=4$ supersymmetry.

We will now look at Chern-Simons gauge theory 
from the perspective that emerges from its treatment 
in the context of the large-N expansion. We will 
restrict the discussion to the case of knots on $S^3$ with
gauge group SU(N). Gauge theories with gauge group
SU(N) admit, besides the perturbative expansion, a
large-N expansion. In this expansion correlators are
expanded in powers of 1/N while keeping the
't Hooft coupling $t = Nx$ fixed, x being the coupling
constant of the gauge theory. For example, for the
free energy of the theory, one has the general form

$$F = \sum_{g\ge 0,\ h\ge 1} C_{g,h}\, N^{2-2g}\, t^{2g-2+h} \qquad [51]$$

In the case of Chern-Simons gauge theory, the coupling
constant is $x = 2\pi i/(k+N)$ after taking into account
the shift in k. The large-N expansion [51] resembles a
string-theory expansion and indeed the quantities $C_{g,h}$
can be identified with the partition function of a
topological open string with g handles and h boundaries,
with N D-branes on $S^3$ in an ambient six-dimensional
target space $T^*S^3$. This was pointed out by
Witten in 1992. The result makes a connection between
a topological three-dimensional field theory and the
topological strings described in the previous section.

In 1998 an important breakthrough took place 
which provided a new approach to compute quan- 
tities such as [51]. Using arguments inspired by the 
AdS/CFT correspondence, Gopakumar and Vafa 
(1999) provided a closed-string-theory interpretation 
of the partition function [51]. They conjectured that 
the free energy F can be expressed as

$$F = \sum_{g\ge 0} N^{2-2g}\, F_g(t) \qquad [52]$$

where $F_g(t)$ corresponds to the partition function of a
topological closed-string theory on the noncompact
Calabi-Yau manifold X called the resolved conifold,
$\mathcal{O}(-1)\oplus\mathcal{O}(-1)\to\mathbb{P}^1$, t being the flux of the B-field
through $\mathbb{P}^1$. The quantities $F_g(t)$ have been computed
using both physical and mathematical arguments,
thus proving the conjecture.
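
At the level of formal power series, the passage from the open-string expansion [51] to the closed-string form [52] is a regrouping of terms by genus. The sketch below assumes the form of [51] quoted above and says nothing about convergence of the sum over h; the nontrivial content of the conjecture is the identification of the resulting $F_g(t)$ with a closed-string partition function.

```latex
% Regrouping [51] by genus g, and rewriting each term using t = N x.
\begin{align*}
F &= \sum_{g\ge 0}\sum_{h\ge 1} C_{g,h}\,N^{2-2g}\,t^{2g-2+h}
   = \sum_{g\ge 0} N^{2-2g}\Bigl(\sum_{h\ge 1} C_{g,h}\,t^{2g-2+h}\Bigr), \\
F_g(t) &= \sum_{h\ge 1} C_{g,h}\,t^{2g-2+h},
\qquad
N^{2-2g}\,t^{2g-2+h} = x^{2g-2+h}\,N^{h}.
\end{align*}
```

The last identity shows that, in terms of the original coupling x, each of the h boundaries contributes one factor of N, as expected for N D-branes.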

Once a new picture for the partition function of 
Chern-Simons gauge theory is available, one should 
ask about the form that the expectation values of 
Wilson loops could take in the new context. The 
question was addressed by Ooguri and Vafa and they
provided the answer, later refined by Labastida,
Mariño, and Vafa. The outcome is an entirely new
point of view in the theory of knot and link
invariants. The new picture provides a geometrical
interpretation of the integer coefficients of the
quantum group invariants, an issue that has been
investigated for many years. To present an
account of these developments, one first needs to
review some basic facts of large-N expansions.

To consider the presence of Wilson loops, it is 
convenient to introduce a particular generating 
functional. First, one performs a change of basis 
from representations R to conjugacy classes C(k) of
the symmetric group, labeled by vectors
$k = (k_1, k_2, \ldots)$ with $k_j \ge 0$, and $|k| = \sum_j k_j > 0$.
The change of basis is $W_k = \sum_R \chi_R(C(k))\, W_R$,
where $\chi_R$ are characters of the permutation group
$S_\ell$ of $\ell = \sum_j j\,k_j$ elements ($\ell$ is also the number of
boxes of the Young tableau associated to R).
Second, one introduces the generating functional:

$$F(V) = \log Z(V) = \sum_k \frac{|C(k)|}{\ell!}\, W_k^{(c)}\, \Upsilon_k(V) \qquad [53]$$

where

$$Z(V) = \sum_k \frac{|C(k)|}{\ell!}\, W_k\, \Upsilon_k(V), \qquad \Upsilon_k(V) = \prod_j \left(\mathrm{tr}\, V^j\right)^{k_j}$$

In these expressions |C(k)| denotes the number of
elements of the class C(k) in $S_\ell$. The reason behind
the introduction of this generating functional is that
the large-N structure of the connected Wilson loops,
$W_k^{(c)}$, turns out to be very simple:

$$\frac{|C(k)|}{\ell!}\, W_k^{(c)} = \sum_{g=0}^{\infty} x^{2g-2+|k|}\, F_{g,k}(\lambda) \qquad [54]$$

where $\lambda = e^t$ and $t = Nx$ is the 't Hooft coupling.
Writing $x = t/N$, it corresponds to a power series
expansion in 1/N. As before, the expansion looks
like a perturbative series in string theory where g is
the genus and |k| is the number of holes. Ooguri and
Vafa conjectured in 1999 the appropriate string-
theory description of [54]. It corresponds to an open
topological string theory (notice that the ones
described in the previous section were closed),
whose target space is the resolved conifold X. The
contribution from this theory will lead to open-
string analogs of Gromov-Witten invariants.
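
To make the change of basis and the notion of connected Wilson loops concrete, it may help to write out the lowest orders, $\ell = 1$ and $\ell = 2$. The sketch below is illustrative only: the labels $\square$, S and A for the fundamental, symmetric and antisymmetric representations are notation introduced here, and it is assumed that Z(V) also contains the constant term 1 (the empty vector k), so that its logarithm has a power series expansion.

```latex
% Characters of S_1 and S_2 give the change of basis for \ell = 1, 2.
% k = (1,0,...): \ell = 1;  k = (2,0,...): identity class of S_2;
% k = (0,1,0,...): transposition class of S_2.
\begin{align*}
W_{(1)} &= W_{\square}, &
W_{(2,0)} &= W_{S} + W_{A}, &
W_{(0,1)} &= W_{S} - W_{A}, \\
\Upsilon_{(1)} &= \mathrm{tr}\,V, &
\Upsilon_{(2,0)} &= (\mathrm{tr}\,V)^2, &
\Upsilon_{(0,1)} &= \mathrm{tr}\,V^2 .
\end{align*}
% With |C(k)|/\ell! = 1, 1/2, 1/2 respectively:
\begin{align*}
Z(V) &= 1 + W_{(1)}\,\mathrm{tr}\,V
      + \tfrac12 W_{(2,0)}\,(\mathrm{tr}\,V)^2
      + \tfrac12 W_{(0,1)}\,\mathrm{tr}\,V^2 + \cdots, \\
\log Z(V) &= W_{(1)}\,\mathrm{tr}\,V
      + \tfrac12\bigl(W_{(2,0)} - W_{(1)}^2\bigr)(\mathrm{tr}\,V)^2
      + \tfrac12 W_{(0,1)}\,\mathrm{tr}\,V^2 + \cdots .
\end{align*}
```

Comparing with [53], one reads off $W^{(c)}_{(1)} = W_{(1)}$, $W^{(c)}_{(2,0)} = W_{(2,0)} - W_{(1)}^2$ and $W^{(c)}_{(0,1)} = W_{(0,1)}$ at this order.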

In order to describe in more detail the fact that one 
is dealing with open strings, some new data need to 
be introduced. Here is where the knot description 
intrinsic to the Wilson loop enters. Given a knot K on
$S^3$, let us associate to it a Lagrangian submanifold $C_K$
with $b_1 = 1$ in the resolved conifold X and consider a
topological open string on it. The contributions in
this open topological string are localized on holomorphic
maps $f:\Sigma_{g,h}\to X$ with $h = |k|$ which satisfy
$f_*[\Sigma_{g,h}] = Q$, and $f_*[C] = j[\gamma]$ for $k_j$ oriented circles
C. In these expressions $\gamma\in H_1(C_K,\mathbb{Z})$, and
$Q\in H_2(X,C_K,\mathbb{Z})$, that is, the map is such that $k_j$
boundaries of $\Sigma_{g,h}$ wrap the knot j times, and $\Sigma_{g,h}$
itself gets mapped to a relative two-homology class
characterized by the Lagrangian submanifold $C_K$.
The number of such maps (in the sense described in
the previous section) is the open-string analog of
Gromov-Witten invariants. They will be denoted by
$N^Q_{g,k}$. Comparing to the situation that led to [47] in
the closed-string case, one concludes that in this case
the quantities $F_{g,k}(\lambda)$ in [54] must take the form

$$F_{g,k}(\lambda) = \sum_Q N^Q_{g,k}\, e^{\int_Q\omega} \qquad [55]$$

where $\omega$ is the Kähler class of the Calabi-Yau
manifold X and $\lambda = e^t$. For any Q, one can always
write $\int_Q\omega = Qt$, where Q is in general a half-integer
number. Therefore, $F_{g,k}(\lambda)$ is a polynomial in $\lambda^{\pm 1/2}$
with rational coefficients.
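
The polynomial structure follows in one line from the two relations just stated. As a sketch, assuming the form of [55] quoted above and that only finitely many classes Q contribute for given g and k:

```latex
% Substituting \int_Q \omega = Q t and \lambda = e^t into [55].
\begin{equation*}
F_{g,k}(\lambda) = \sum_Q N^Q_{g,k}\, e^{Qt}
                 = \sum_Q N^Q_{g,k}\, \lambda^{Q},
\qquad Q \in \tfrac12\,\mathbb{Z}.
\end{equation*}
```

Each half-integer Q contributes an integer power of $\lambda^{1/2}$ or of $\lambda^{-1/2}$, with the rational number $N^Q_{g,k}$ as coefficient.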

The result [55] is very impressive but still does not 
provide a representation where one can assign a 
geometrical interpretation to the integer coefficients 
of the quantum-group invariants. Notice that to
match a polynomial invariant to [55], after obtaining
its connected part, one must expand it in x after
setting $q = e^x$ keeping $\lambda$ fixed. One would like to
have a refined version of [55], in the spirit of what
was described in the previous section leading from
the Gromov-Witten invariants $N_g^\beta$ of [47] to the
new integer invariants $n_g^\beta$ of [48]. It turns out that,
indeed, F(V) can be expressed in terms of integer
invariants in complete analogy with the description
presented in the previous section for topological
strings. A good review on the subject can be found
in Mariño (2005).


Concluding Remarks 


In this overview we have introduced key features of
TQFTs and we have described some of the most
relevant results that have emerged from them. We have
described how the many faces of TQFT provide a
variety of important insights in a selected set of
problems in topology. Outstanding among these are the
reformulation of Donaldson theory and the discovery
of the Seiberg-Witten invariants, and the string-theory
description of the large-N expansion of Chern-Simons
gauge theory, which provides an entirely new point of
view in the study of knot and link invariants and points
to a fascinating underlying interplay between string
theory, knot theory, and enumerative geometry which
opens new fields of study.

In addition to their intrinsic mathematical inter- 
est, TQFTs have been found relevant to important 
questions in physics as well. This is so because, in a 
sense, TQFTs are easier to solve than conventional 
quantum field theories. For example, topological 
sigma models are relevant to the computation of 
certain couplings in string theory. Also, Witten-type
gauge TQFTs such as Donaldson-Witten theory
and its generalizations play a role in string theory as
effective world-volume theories of extended string
states (branes) wrapping curved spaces, and TQFTs
arising from $\mathcal{N}=4$ gauge theories in four dimensions
have shed light on field- (and string-) theory
dualities.

Most of these developments, and others that we
have not touched upon or only mentioned in passing,
have their own entries in the encyclopedia, to which
we refer the interested reader for further details.


See also: Axiomatic Approach to Topological Quantum 
Field Theory; BF Theories; Chern-Simons Models: 
Rigorous Results; Donaldson-Witten Theory; Gauge 
Theoretic Invariants of 4-Manifolds; Gauge Theory: 
Mathematical Applications; Hamiltonian Fluid Dynamics; 
The Jones Polynomial; Knot Theory and Physics; 
Mathai-Quillen Formalism; Mathematical Knot Theory; 
Schwarz-Type Topological Quantum Field Theory; 
Seiberg-Witten Theory; Stationary Phase Approximation; 
Topological Sigma Models.


Introduction


Topological sigma models govern the quantum
mechanics of maps from a Riemann surface $\Sigma$ to a
target space M. In contrast to the standard supersymmetric
sigma model, the topological version has a
special local shift symmetry. This symmetry takes the
form $\delta u^i = \epsilon^i$, where $\epsilon^i$ is an arbitrary local function of
the coordinates on the base manifold $\Sigma$. In essence, this
topological shift symmetry ensures that all local
degrees of freedom of the model can be gauged away.
As a result, the dynamics of such a model resides in a
finite number of global topological degrees of freedom.
This feature is generic to all topological field theories
of Witten type, also known as cohomological field
theories (see Topological Quantum Field Theory:
Overview). The topological shift symmetry is responsible
for the special topological nature of the model,
which is seen most readily by BRST quantizing the
local shift symmetry. This gives rise to a nilpotent
BRST operator Q. The properties of this BRST
operator are crucial for establishing the topological
nature of the model. The key point in the construction
of any cohomological field theory is the fact that the
full quantum action $S_q$ can be written as a BRST
commutator $S_q = \{Q, V\}$, where V is a function of the
fields needed to define the path integral. In particular,
one can show that the partition function and all
correlation functions are independent of the metric on
both the base manifold $\Sigma$ and the target space M. For
example, let us define the path integral by

$$Z = \int d\Phi\; e^{-S_q} \qquad [1]$$

where $\Phi$ denotes the full set of fields required at the
quantum level. In general, the function V depends
on geometric data of both $\Sigma$ and M. Nevertheless,
one can easily establish that the partition function is
independent of this data by noting the following.
Variation of Z with respect to the metric of the
target space g (for example) gives

$$\delta_g Z = -\int d\Phi\; e^{-S_q}\, \{Q, \delta_g V\} \qquad [2]$$


The right-hand side of this equation is nothing but
the vacuum expectation value of a BRST commutator,
and this vanishes by BRST invariance of the
vacuum. It is important to note here that the BRST
operator Q can be constructed to be independent of
g. Apart from the necessity of introducing the metric
tensor, these models also require additional geometric
data for their construction. The complex
structure of $\Sigma$, and at least an almost-complex
structure on M, are required. By a similar argument,
one can show that the partition function and
correlation functions are independent of this extra
geometric data. As mentioned above, these models
possess no local degrees of freedom. One can then
show that the path-integral expression for the
correlation functions can be localized to a finite-dimensional
moduli space of instanton configurations
which minimize the classical action.
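
The metric-independence argument can be spelled out step by step. The following sketch assumes, beyond what is stated in the text, that the path-integral measure $d\Phi$ does not itself depend on the target-space metric g; the bracket $\langle\cdot\rangle$ denotes the unnormalized expectation value.

```latex
% Variation of Z = \int d\Phi\, e^{-S_q} with S_q = \{Q, V\},
% using that Q is independent of g, so \delta_g\{Q,V\} = \{Q,\delta_g V\}.
\begin{align*}
\delta_g Z &= \int d\Phi\;\delta_g\!\left(e^{-S_q}\right)
            = -\int d\Phi\; e^{-S_q}\,\delta_g S_q \\
           &= -\int d\Phi\; e^{-S_q}\,\delta_g\{Q,V\}
            = -\int d\Phi\; e^{-S_q}\,\{Q,\delta_g V\}
            = -\bigl\langle \{Q,\delta_g V\}\bigr\rangle = 0 .
\end{align*}
```

The same chain of equalities applies to any other parameter on which V depends but Q does not, which is why the argument extends to the complex structures.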

We will first show how the full quantum action of 
the theory can be obtained as a BRST quantization of a 
classical action with a local gauge symmetry. How- 
ever, we shall then highlight the fact that the gauge 
algebra for this topological shift symmetry only closes 
on-shell. In order to proceed with a BRST quantization 
of the model, and obtain the complete quantum 
action, one must have recourse to the Batalin-Vilkovisky
quantization scheme. This machinery is
ideally tailored for such a problem, with the end result
that quartic ghost terms are present in the action.
However, the presence of such terms does not affect
the arguments presented above, since the quantum action
is still obtained as a BRST commutator. Following this,
we construct all observables of the theory and
demonstrate their connection to the de Rham cohomology
of the target space. The special topological
properties of the observables are then discussed, and it
is shown how their computation is localized to the
moduli space $\mathcal{M}$ of holomorphic maps from $\Sigma$ to M.
As a particular example, we show how the computation
of a certain class of observables determines the
intersection numbers of the moduli space $\mathcal{M}$. We
present a brief discussion of the connection between
topological sigma models with Calabi-Yau target
space M, and the mirror symmetry of M.


Construction of the Model 


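We begin with the following classical action:

$$S_c = \int d^2\sigma\, \sqrt{h}\; h_{\alpha\beta}\, g_{ij}\, K^{\alpha i} K^{\beta j} \qquad [3]$$

where

$$K^{\alpha i} = G^{\alpha i} - \tfrac{1}{2}\left(\partial^\alpha u^i + \epsilon^\alpha{}_\beta\, J^i{}_j\, \partial^\beta u^j\right) \qquad [4]$$

The fields $G^{\alpha i}$ and $K^{\alpha i}$ both satisfy the self-duality
constraint

$$G^{\alpha i} = P_+{}^{\alpha i}{}_{\beta j}\, G^{\beta j}, \qquad K^{\alpha i} = P_+{}^{\alpha i}{}_{\beta j}\, K^{\beta j} \qquad [5]$$

where the self-dual and anti-self-dual projection
operators are defined as

$$P_\pm{}^{\alpha i}{}_{\beta j} = \tfrac{1}{2}\left(\delta^\alpha{}_\beta\, \delta^i{}_j \pm \epsilon^\alpha{}_\beta\, J^i{}_j\right) \qquad [6]$$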


The above action describes a theory of maps $u^i(\sigma)$
from a Riemann surface $\Sigma$ to an almost complex
manifold M. The coordinates on $\Sigma$ are denoted by
$\sigma^\alpha$ ($\alpha = 1,2$), while those on the target manifold M
are denoted by $u^i$ ($i = 1,\ldots,\dim M$). The metric and
complex structure of $\Sigma$ are denoted by $h_{\alpha\beta}$ and $\epsilon^\alpha{}_\beta$,
respectively; they obey the relations $\epsilon^\alpha{}_\beta\,\epsilon^\beta{}_\gamma = -\delta^\alpha{}_\gamma$,
and $\epsilon_{\alpha\beta} = h_{\alpha\gamma}\,\epsilon^\gamma{}_\beta$. The metric tensor $g_{ij}$ and almost-complex
structure $J^i{}_j$ of M obey analogous relations
to the above. In the general model, the target space
need only be an almost-complex manifold. This
requires the existence of a globally defined tensor
field $J^i{}_j$ such that $J^i{}_j J^j{}_k = -\delta^i{}_k$.
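
The relations obeyed by the two complex structures are exactly what is needed for the operators in [6] to be projectors. A short check, written in a condensed notation introduced here for illustration, in which $\epsilon J$ stands for the operator with components $\epsilon^\alpha{}_\beta J^i{}_j$ and 1 for $\delta^\alpha{}_\beta\delta^i{}_j$:

```latex
% (\epsilon J)^2 = 1 follows from \epsilon^2 = -1 and J^2 = -1.
\begin{align*}
(\epsilon J)^2 &:\quad
 \epsilon^\alpha{}_\beta\,\epsilon^\beta{}_\gamma\; J^i{}_j\, J^j{}_k
 = (-\delta^\alpha{}_\gamma)(-\delta^i{}_k)
 = \delta^\alpha{}_\gamma\,\delta^i{}_k , \\
P_\pm^2 &= \tfrac14\bigl(1 \pm 2\,\epsilon J + (\epsilon J)^2\bigr)
         = \tfrac12\bigl(1 \pm \epsilon J\bigr) = P_\pm , \\
P_+ P_- &= \tfrac14\bigl(1 - (\epsilon J)^2\bigr) = 0 ,
\qquad P_+ + P_- = 1 .
\end{align*}
```

In particular, the combination $\partial^\alpha u^i + \epsilon^\alpha{}_\beta J^i{}_j\partial^\beta u^j = 2\,P_+{}^{\alpha i}{}_{\beta j}\,\partial^\beta u^j$ appearing in [4] is automatically self-dual, consistent with the constraint [5].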

The action [3] is invariant under the topological
shift symmetry

$$\delta u^i = \epsilon^i \qquad [7]$$

where $\epsilon^i$ is an arbitrary local function of the
coordinates on the base manifold $\Sigma$. Already, at
this level, we see the distinction with the standard
sigma model. The presence of this shift symmetry
means that all local degrees of freedom can be
gauged away, leaving only a finite number of global
topological degrees of freedom. It requires some
work to determine the corresponding transformation
for $G^{\alpha i}$, the key point being the preservation of the
self-duality constraint. We find


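$$\delta G^{\alpha i} = P_+{}^{\alpha i}{}_{\beta j}\left(D^\beta\epsilon^j + \tfrac{1}{2}\,\epsilon^\beta{}_\gamma\,\epsilon^k\,(D_k J^j{}_l)\,\partial^\gamma u^l\right) + \tfrac{1}{2}\,\epsilon^\alpha{}_\beta\,\epsilon^k\,(D_k J^i{}_j)\, G^{\beta j} - \Gamma^i_{jk}\,\epsilon^j\, G^{\alpha k} \qquad [8]$$


where the covariant derivative is
$D_\alpha\epsilon^i = \partial_\alpha\epsilon^i + \Gamma^i_{jk}\,(\partial_\alpha u^j)\,\epsilon^k$.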

Having determined the classical symmetries of the
model, we can now proceed with the BRST quantized
form of the quantum action. As a topological field
theory of Witten type, one can show that the quantum
action can be written as a BRST commutator, that is,
$S_q = \{Q, V\}$, where the gauge fermion V is defined by

$$V = \int d^2\sigma\, \sqrt{h}\; \bar C_{\alpha i}\left(\partial^\alpha u^i - \frac{\alpha}{4}\, B^{\alpha i}\right) \qquad [9]$$

where $\alpha$ is an arbitrary gauge-fixing parameter. The
BRST operator Q is nilpotent, $Q^2 = 0$, off-shell. It is
defined by $\delta\Phi = \epsilon\{Q, \Phi\}$, and takes the form


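$$\begin{aligned}
\delta u^i &= \epsilon\, C^i \\
\delta C^i &= 0 \\
\delta \bar C_{\alpha i} &= \epsilon\left( B_{\alpha i} + \tfrac{1}{2}\,\epsilon_\alpha{}^\beta\,(D_k J^j{}_i)\,\bar C_{\beta j}\, C^k + \Gamma^j_{ik}\,\bar C_{\alpha j}\, C^k \right) \\
\delta B_{\alpha i} &= \epsilon\Big( \tfrac{1}{4}\, C^k C^l \left(R_{kli}{}^{t} + R_{klrs}\, J^r{}_i\, J^{st}\right)\bar C_{\alpha t} + \tfrac{1}{2}\,\epsilon_\alpha{}^\beta\,(D_k J^j{}_i)\, C^k B_{\beta j} \\
&\qquad + \tfrac{1}{4}\,(C^k D_k J^s{}_i)(C^l D_l J^t{}_s)\,\bar C_{\alpha t} + \Gamma^j_{ik}\, C^k B_{\alpha j} \Big)
\end{aligned} \qquad [10]$$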
In the above, the ghost field is denoted by $C^i$, while
the anti-ghost field $\bar C_{\alpha i}$ and the multiplier field $B_{\alpha i}$
obey the self-duality constraint [5]. The key point to
note in the above transformations is the fact that the
ghost field $C^i$ is BRST invariant. Again, this is a
feature which is generic to all cohomological field
theories. The existence of such a field allows the
construction of an entire set of topological correlation
functions, as we shall see in the following
section.

While the gauge-fixing parameter $\alpha$ is arbitrary, a
conventional choice is to take $\alpha = 1$, and then
integrate out the multiplier field B. This yields the
action in the form


1 E NE ae 
Sq = / d^ovbh > b^? gi, wu Ogu! 2 7 e"? T:0, 1 Ogu! 
y ; 1 ; ; ; 
+ Cai (D° C +5 ea (Di'a) Ou! c) 


T. es 


LC. CU Ras 


pu SEI 


+ OI JC [11] 
It should be stressed that the classical gauge algebra 
[7] and [8] only closes on-shell. Quantization of the 
model is therefore more subtle, and requires use of 
the Batalin-Vilkovisky formalism. The on-shell 
closure problem automatically results in the pre- 
sence of quartic ghost coupling terms in the action 
and consequently cubic terms in the BRST transfor- 
mations. Despite this, we have established that the 
full quantum action can be written as a BRST 
commutator. 

The form of the action simplifies when the
complex structure of the target manifold is
covariantly constant, $D_k J^i{}_j = 0$. In this case, the
target manifold M is Kähler and we denote the
complex coordinates as $u^I$, with their complex
conjugates denoted by $u^{\bar I}$. The nonzero components
of the metric tensor are then $g_{I\bar J}$. Similarly,
the coordinates of $\Sigma$ are denoted $\sigma^{\pm}$, with nonzero
metric components $h_{+-}$. The nonzero components
of the ghost and anti-ghost are then given by 
Cl, C, C,j, C. j. The action can be written in the 
form 


Sq = [ov Lr gba. ad +50 (D. Dp gy 


4 zC (D,C))h* gy 


a ja 


alt CC Ray? d [12] 


Construction of Observables 


Having defined the quantum action, it is now of
interest to consider the correlation functions of the
model. In the functional integral, we integrate over
all maps $\Sigma\to M$ in a fixed homotopy class. Let us
consider a correlation function

$$\langle\mathcal{O}\rangle = \int du\, d\bar C\, dC\; e^{-tS_q}\,\mathcal{O} \qquad [13]$$

where t > 0 is a parameter, and the observable $\mathcal{O}$ is
BRST invariant, $\{Q, \mathcal{O}\} = 0$. From the BRST invariance
of the vacuum, it follows immediately that the
vacuum expectation value of a BRST commutator is
zero, $\langle\{Q, \mathcal{O}\}\rangle = 0$. An operator which is a BRST
commutator is said to be Q-exact. Hence, our
interest is in the Q-cohomology classes of operators,
that is, BRST invariant operators modulo BRST
exact operators. It is for this reason that such a
model is called a cohomological field theory.

One can now show that the variation of [13] with
respect to t is a BRST commutator, namely

$$\delta_t\langle\mathcal{O}\rangle = -\delta t\int du\, d\bar C\, dC\; e^{-tS_q}\,\{Q, V\mathcal{O}\} = 0 \qquad [14]$$

As a result, one can evaluate the correlation function
in the large-t (weak-coupling) limit. In this limit, the
path integral is dominated by fluctuations around
the classical minima. For the sigma model under
study, the classical action is minimized by the
instanton configurations

$$\partial_\alpha u^i + \epsilon_\alpha{}^\beta\, J^i{}_j\, \partial_\beta u^j = 0 \qquad [15]$$

Indeed, this localization of the path integral to the
moduli space of instantons can also be seen by
choosing the $\alpha = 0$ gauge in [9]. Integration over the
multiplier field then imposes a delta function
constraint to the instanton configurations. The key
point in the above derivation is the fact that the
quantum action is a BRST commutator, $S_q = \{Q, V\}$.
By a similar argument, one can show that variations
of $\langle\mathcal{O}\rangle$ with respect to the metric and complex
structure of $\Sigma$ and M are also zero.
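
The step leading to [14] combines the exactness of the action with the invariance of the observable. As a sketch, assuming that V carries odd ghost number (so that the graded Leibniz rule produces the relative sign shown) and that the measure is BRST invariant:

```latex
% t-derivative of <O> = \int e^{-t S_q} O, with S_q = \{Q, V\}, \{Q, O\} = 0.
\begin{align*}
\frac{d}{dt}\langle\mathcal{O}\rangle
 &= -\int du\, d\bar C\, dC\; e^{-tS_q}\,\{Q,V\}\,\mathcal{O}, \\
\{Q, V\mathcal{O}\} &= \{Q,V\}\,\mathcal{O} - V\,\{Q,\mathcal{O}\}
                     = \{Q,V\}\,\mathcal{O}, \\
\frac{d}{dt}\langle\mathcal{O}\rangle
 &= -\int du\, d\bar C\, dC\; e^{-tS_q}\,\{Q, V\mathcal{O}\}
  = -\bigl\langle \{Q, V\mathcal{O}\}\bigr\rangle = 0 .
\end{align*}
```

Since the result does not depend on t, the large-t evaluation, in which only the neighborhood of the instanton configurations contributes, is exact.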

Our aim now is to construct the Q-cohomology
classes of operators in the theory. Let us first associate
an operator $\mathcal{O}_A^{(0)}$ to each p-form
$A = A_{i_1\cdots i_p}\, du^{i_1}\wedge\cdots\wedge du^{i_p}$ on the target space M, given by

$$\mathcal{O}_A^{(0)} = A_{i_1\cdots i_p}\, C^{i_1}\cdots C^{i_p} \qquad [16]$$

where $C^i$ is the ghost field. Under a BRST
transformation, we see that

$$\{Q, \mathcal{O}_A^{(0)}\} = \partial_{i_0} A_{i_1\cdots i_p}\, C^{i_0} C^{i_1}\cdots C^{i_p} = \mathcal{O}_{dA}^{(0)} \qquad [17]$$

since the ghost fields are BRST invariant by [10].
Hence, $\mathcal{O}_A^{(0)}$ is BRST invariant if and only if A is a
closed p-form. Similarly, if A is an exact p-form,
then the corresponding operator is Q-exact. Hence,
the BRST cohomology classes of these operators are
in one to one correspondence with the de Rham
cohomology classes on M. The reason for assigning
the peculiar superscript to the operator $\mathcal{O}_A^{(0)}$ will
become clear at the end of this construction. Notice
also that operators of the form $\mathcal{O}_A^{(0)}$ can be used as
building blocks for constructing new observables. If
we consider a set of closed forms $A_1,\ldots,A_n$, then
the product of the associated operators $\mathcal{O}_{A_1}^{(0)}\cdots\mathcal{O}_{A_n}^{(0)}$
is clearly Q-invariant as well.
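
The lowest-degree cases illustrate the correspondence with de Rham cohomology. The sketch below uses only $\{Q, u^i\} = C^i$ and $\{Q, C^i\} = 0$, as implied by the first two transformations in [10]; the function f is a hypothetical example introduced here.

```latex
% p = 1: O_A^{(0)} = A_i C^i. The C's anticommute, so only the
% antisymmetric part of \partial_j A_i survives.
\begin{align*}
\{Q, A_i(u)\,C^i\} &= \partial_j A_i\, C^j C^i
   = \tfrac12\left(\partial_j A_i - \partial_i A_j\right) C^j C^i ,
\end{align*}
% which vanishes for all C exactly when \partial_j A_i = \partial_i A_j,
% i.e. when dA = 0.
% p = 0: for a function f on M, \{Q, f(u)\} = \partial_i f\, C^i,
% so the operator of the exact 1-form A = df is Q-exact:
\begin{align*}
\mathcal{O}^{(0)}_{df} = \partial_i f\, C^i = \{Q, f(u)\} .
\end{align*}
```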

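When considering the vacuum expectation values
of operators which are polynomials in the fields,
there is an implicit dependence on the points where
the operators are located. In the case at hand
however, the operator $\mathcal{O}_A^{(0)}(\sigma)$ at the point $\sigma$ has a
vacuum expectation value which is a topological
invariant, and thus cannot depend on the chosen
point. To see this explicitly, we consider all fields
defined over $\Sigma$, and differentiate the operator with
respect to some local coordinates $\sigma^\alpha$:

$$\frac{\partial}{\partial\sigma^\alpha}\,\mathcal{O}_A^{(0)} = \left(\partial_{i_0} A_{i_1\cdots i_p}\right)\frac{\partial u^{i_0}}{\partial\sigma^\alpha}\, C^{i_1}\cdots C^{i_p} + p\, A_{i_1\cdots i_p}\left(\partial_\alpha C^{i_1}\right) C^{i_2}\cdots C^{i_p} \qquad [18]$$

In terms of exterior derivatives, this takes the form

$$d\mathcal{O}_A^{(0)} = \partial_{i_0} A_{i_1\cdots i_p}\, du^{i_0}\, C^{i_1}\cdots C^{i_p} + p\, A_{i_1\cdots i_p}\, dC^{i_1}\, C^{i_2}\cdots C^{i_p} = \{Q, \mathcal{O}_A^{(1)}\} \qquad [19]$$

where $\mathcal{O}_A^{(1)} = -p\, A_{i_1\cdots i_p}\, du^{i_1} C^{i_2}\cdots C^{i_p}$, and we have
used the fact that A is a closed p-form. If we let $\gamma$
represent any path between two arbitrary points P
and P', then this expression has the integral form

$$\mathcal{O}_A^{(0)}(P) - \mathcal{O}_A^{(0)}(P') = \int_\gamma d\mathcal{O}_A^{(0)} = \left\{Q, \int_\gamma \mathcal{O}_A^{(1)}\right\} \qquad [20]$$

and we see that the vacuum expectation value of
$\mathcal{O}_A^{(0)}$ is point independent by the BRST invariance of
the vacuum. The same remark applies to any
product of operators of the form we are considering.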
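To continue our construction, consider a one-dimensional
homology cycle $\gamma$ ($\partial\gamma = 0$), and define

$$W_A^{(1)}(\gamma) = \int_\gamma \mathcal{O}_A^{(1)} \qquad [21]$$

This new operator $W_A^{(1)}(\gamma)$ is BRST invariant by
inspection,

$$\{Q, W_A^{(1)}(\gamma)\} = \int_\gamma \{Q, \mathcal{O}_A^{(1)}\} = \int_\gamma d\mathcal{O}_A^{(0)} = 0 \qquad [22]$$

Moreover, if $\gamma$ happens to be the boundary of a
two-dimensional surface ($\gamma = \partial\beta$), so that $\gamma$ is trivial in
homology, then this new operator is likewise trivial
in Q-cohomology:

$$W_A^{(1)}(\gamma) = \int_\gamma \mathcal{O}_A^{(1)} = \int_\beta d\mathcal{O}_A^{(1)} = \left\{Q, \int_\beta \mathcal{O}_A^{(2)}\right\} \qquad [23]$$

where

$$\mathcal{O}_A^{(2)} = -\frac{p(p-1)}{2}\, A_{i_1\cdots i_p}\, du^{i_1}\wedge du^{i_2}\, C^{i_3}\cdots C^{i_p}$$

As before, let us now associate to each homology
2-cycle $\beta$ ($\partial\beta = 0$), another BRST invariant operator
$W_A^{(2)}(\beta)$ defined by

$$W_A^{(2)}(\beta) = \int_\beta \mathcal{O}_A^{(2)} \qquad [24]$$

The BRST invariance follows trivially as in [23].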

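In summary, we have produced three operators
$\mathcal{O}_A^{(0)}$, $\mathcal{O}_A^{(1)}$, and $\mathcal{O}_A^{(2)}$ from any given closed form A,
which satisfy the relations

$$0 = \{Q, \mathcal{O}_A^{(0)}\}, \qquad d\mathcal{O}_A^{(0)} = \{Q, \mathcal{O}_A^{(1)}\}, \qquad d\mathcal{O}_A^{(1)} = \{Q, \mathcal{O}_A^{(2)}\}, \qquad d\mathcal{O}_A^{(2)} = 0 \qquad [25]$$

The BRST observables are then given by arbitrary
products of the integrated operators $W_A^{(i)}(\gamma) = \int_\gamma \mathcal{O}_A^{(i)}$,
where $\gamma$ is any i-cycle in homology.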
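
As an illustration of this tower of operators, consider a closed 2-form on M. The sketch below simply specializes the general expressions for the three operators to p = 2; the overall signs and normalizations are taken at face value from those expressions and should be regarded as convention dependent.

```latex
% Descent tower for a closed 2-form A = A_{ij}\, du^i \wedge du^j.
\begin{align*}
\mathcal{O}_A^{(0)} &= A_{ij}\, C^i C^j
   && \text{(ghost number 2, a 0-form on } \Sigma), \\
\mathcal{O}_A^{(1)} &= -2\, A_{ij}\, du^i\, C^j
   && \text{(ghost number 1, a 1-form on } \Sigma), \\
\mathcal{O}_A^{(2)} &= -A_{ij}\, du^i\wedge du^j
   && \text{(ghost number 0, a 2-form on } \Sigma).
\end{align*}
% Integrating the top component over a 2-cycle \beta gives, up to sign,
% the integral of the pullback of A under the map u:
\begin{equation*}
W_A^{(2)}(\beta) = \int_\beta \mathcal{O}_A^{(2)} = -\int_\beta u^{*}A .
\end{equation*}
```

Each step down the tower trades one ghost field for one differential $du^i$, so the form degree on $\Sigma$ plus the ghost number always equals p.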


Observables and Intersection Theory 


Let us consider the computation of the correlation
function $\langle\mathcal{O}\rangle$ in the background field method. We
first pick a background instanton configuration [15],
and then integrate over the quantum fluctuations
around that instanton. The relevant part of the
quantum action is quadratic in the quantum fields,
and localization of the model then ensures that such
a computation is exact. The quantum fields are
expanded into eigenfunctions of the operators that
appear in the quadratic part of the action, and the
functional integral is replaced by an integral over the
eigenmodes. However, if there are fermionic zero
modes, then those modes do not enter in the action.
As a result, the fermionic integrals ($\int d\psi = 0$) over
those modes will cause $\langle\mathcal{O}\rangle$ to vanish unless it has
the correct fermion content; the zero modes must be
absorbed. In our case, a glance at the quantum
action indicates that we should concern ourselves
with the zero modes of the ghost $C^i$ and anti-ghost
$\bar C_{\alpha i}$. A $C^i$ zero mode is clearly in the kernel of the
operator

$$D^{\alpha i}{}_j = D^\alpha\,\delta^i{}_j + \epsilon^\alpha{}_\beta\, J^i{}_j\, D^\beta + \epsilon^\alpha{}_\beta\,(D_j J^i{}_k)\,\partial^\beta u^k \qquad [26]$$

and a $\bar C_{\alpha i}$ zero mode is a zero eigenfunction of its
adjoint $D^*$. In the BRST quantization of the model,
the ghost fields $C^i$ are assigned ghost number +1,
while the anti-ghost fields $\bar C_{\alpha i}$ have ghost number
$-1$. It is therefore apparent that the vacuum
expectation value of any observable will vanish
unless that observable has a ghost number equal to
the number of D zero modes, a, minus the number
of $D^*$ zero modes, b. This difference, $w = a - b$, is
called the index of the operator D.
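
The selection rule can be stated as a simple counting condition on the observables built earlier. In the sketch below, $p_r$ denotes the degree of the closed form $A_r$ and $i_r$ the dimension of the cycle $\gamma_r$; these labels are introduced here for illustration.

```latex
% Ghost number of the integrated operator built from a p-form on an i-cycle:
% each C carries +1, and each du replacing a C lowers the count by one.
\begin{equation*}
\mathrm{gh}\bigl(W_{A}^{(i)}(\gamma)\bigr) = p - i ,
\qquad i = 0, 1, 2 .
\end{equation*}
% A correlation function of such operators can be nonzero only if
\begin{equation*}
\Bigl\langle \prod_{r=1}^{n} W_{A_r}^{(i_r)}(\gamma_r) \Bigr\rangle \neq 0
\quad\Longrightarrow\quad
\sum_{r=1}^{n} \left(p_r - i_r\right) = w = a - b .
\end{equation*}
```

For products of the local operators $\mathcal{O}^{(0)}_{A_r}$ alone, the condition reduces to the requirement that the degrees of the forms add up to the index.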

There is a direct link between this index and the
dimension of the moduli space of instantons. Recall
that we are considering the space of maps $\Sigma\to M$ in a
specified homotopy class, which satisfy equation [15].
It is then of interest to determine the dimension of the
space of such solutions. To this aim, we examine the
constraint that arises by considering an instanton $u^i$,
and another neighboring solution $u^i + \delta u^i$, where $\delta u^i$ is
an infinitesimal deformation. To first order in $\delta u^i$, we
see that $\delta u^i$ must be a zero mode of the operator D.
This is no coincidence, and we can thus interpret the
ghost fields $C^i$ as cotangent vectors to instanton
moduli space $\mathcal{M}$. In particular, if $\mathcal{M}$ is a smooth
manifold, then $\dim\mathcal{M} = a$. The index of the operator
D is called the virtual dimension of the moduli space.
In generic situations, the virtual dimension is equal to
the actual dimension $\dim\mathcal{M}$.

It is possible to interpret some of the observables 
that we have described in terms of intersection 
theory applied to the moduli space of instantons. In 
particular, one can show that all correlation functions
of the form


$\langle \mathcal{O}_{A_1} \cdots \mathcal{O}_{A_s} \rangle$ [27]


are intersection numbers of certain submanifolds of 
moduli space. In order to see this in a simple 
example, we first recall the notion of Poincaré 
duality and the relationship between cohomology 
and homology. 

Poincaré duality can be formulated as a relation- 
ship between de Rham cohomology (defined in 
terms of closed differential forms) and homology 
(defined in terms of subspaces of M). For our 
purposes here, it is sufficient to state that we can
associate to each boundaryless submanifold $N$ of
codimension $k$ a cohomology class $[\phi] \in H^{k}(M)$
such that


$\int_M \phi \wedge \nu = \int_N \nu$ [28]


for all $[\nu] \in H^{n-k}(M)$. By $\nu$ on the right-hand side of
this equation, we mean the pullback $i^*\nu$ under the
inclusion $i: N \to M$. Conversely, to each closed
$k$-form $\phi$ on $M$, we can associate an $(n-k)$-cycle
N (it is in general a chain of subspaces), unique up 
to homology, such that the previous relation is 
satisfied. Furthermore, one can show that the 
Poincaré dual to N can be chosen in such a way 
that its support is localized within any given open 
neighborhood of N in M (essentially delta function 
support on N). 

Let us now define the notion of transversal 
intersection. For simplicity, we will first consider 
the intersection of two submanifolds $M_1$ and $M_2$
contained in $M$. We will say that these two
submanifolds have transversal intersection if the
tangent spaces satisfy


$T_x(M_1) + T_x(M_2) = T_x(M)$ [29]


for all $x \in M_1 \cap M_2$. It is a theorem that a submanifold
of codimension $k$ can be locally “cut-out” by $k$ smooth
functions, that is, the submanifold is locally specified by
the zeros of this set of functions. It is a worthwhile
exercise to convince oneself that the definition of
transversal intersection is equivalent to the statement
that the functions which cut-out $M_1$ are independent
from those which cut-out $M_2$. Thus, we can write


$\mathrm{codim}(M_1 \cap M_2) = \mathrm{codim}(M_1) + \mathrm{codim}(M_2)$ [30]


More generally, we say that the intersection $M_1 \cap \cdots \cap M_s$
of $s$ submanifolds is transversal if the intersection of
every pair of them is transversal. It then follows
trivially by the previous argument that the codimensions
must satisfy


$\mathrm{codim}(M_1 \cap \cdots \cap M_s) = \sum_{i=1}^{s} \mathrm{codim}(M_i)$ [31]


The special case which will be important for us 
occurs when the intersection of submanifolds is a 
collection of points, that is, when the codimension 
of the intersection is equal to the dimension of M. 
Since these points are isolated, the compactness of 
M guarantees that they are finite in number. 
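
As a small worked illustration of [29] and [30] (an example chosen here for concreteness, not one taken from the text), consider two great circles on the unit sphere $M = S^2$. The symbols $M_1$, $M_2$ and the point $x$ below are assumptions made only for this sketch.

```latex
% Illustrative example (assumed setting): M = S^2 \subset R^3, dim M = 2.
\begin{align*}
  M_1 &= \{(x_1,x_2,x_3) \in S^2 : x_3 = 0\}, &
  M_2 &= \{(x_1,x_2,x_3) \in S^2 : x_2 = 0\}, \\
  M_1 \cap M_2 &= \{(1,0,0),\ (-1,0,0)\}. &&
\end{align*}
% At x = (1,0,0) the tangent plane of S^2 is spanned by (0,1,0) and (0,0,1):
\begin{align*}
  T_x(M_1) &= \mathrm{span}\{(0,1,0)\}, &
  T_x(M_2) &= \mathrm{span}\{(0,0,1)\}, \\
  T_x(M_1) + T_x(M_2) &= T_x(S^2), &
  \mathrm{codim}(M_1 \cap M_2) &= 1 + 1 = 2 = \dim S^2 .
\end{align*}
```

Each circle is cut out locally by one function ($x_3$ or $x_2$ restricted to the sphere), the two functions are independent at the intersection points, and the intersection is a finite set of isolated points, as expected when the total codimension equals the dimension of $M$.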

We are now in a position to describe in what sense
correlation functions of the form $\langle \mathcal{O}_{A_1} \cdots \mathcal{O}_{A_s} \rangle$
determine intersection numbers in the moduli space
$\mathcal{M}$ of instantons. By definition, this moduli space is
the set of maps from $\Sigma$ to $M$ which satisfy [15]. Let
us consider the generic situation, where the virtual
dimension of $\mathcal{M}$ (i.e., the index of $D$) is equal to
dim $\mathcal{M}$. For convenience, let us begin by choosing the
forms $A_i$ which represent de Rham cohomology
classes on $M$, together with their Poincaré duals $M_i$,
such that the forms have essentially delta function
support on their respective submanifolds. Since each
of the operators in the correlation function depends
on some fixed point $\sigma_i$, it is meaningful to define the
submanifolds $L_i = \{u \in \mathcal{M} \mid u(\sigma_i) \in M_i\} \subset \mathcal{M}$. Now,
the correlation function represents a functional
integral over the space of maps Map($\Sigma$, $M$), and we
have argued that this integral only receives contributions
from the instanton configurations. Since the
operators $A_i(u(\sigma_i))$ vanish unless $u \in L_i$ by our choice
of the Poincaré duals, we see that the only contribution
to the functional integral can be from those maps
which lie in the intersection $L_1 \cap \cdots \cap L_s$. By ghost
number considerations, this correlation function must
vanish unless the codimension of the intersection
equals the virtual dimension of $\mathcal{M}$. In the generic
case where the virtual dimension is equal to dim $\mathcal{M}$,
this means that the intersection is simply a finite
number of points. Intersection numbers $\pm 1$ can then
be assigned to each point in the intersection
$L_1 \cap \cdots \cap L_s$, by considering the relative orientation of the
submanifolds $L_i$ at the intersection points. From the
functional integral point of view, the computation
reduces to an evaluation of the ratio of the bosonic
determinant (integration over $u^i$) to the fermionic
determinant (integration over $C^i$ and $C_{\alpha i}$). In the
Kähler case, for example, the intersection number
assigned to each point in the intersection is always +1.
This is due to the fact that the C!, C. / determinant is 
the complex conjugate of the C’, o, determinant. 


A and B Models and Mirror Symmetry 


The topological sigma model for a Kähler target
space [12] is also known as the topological A model.
In this case, the action can be recovered by twisting
the standard N = 2 supersymmetric sigma model.
This twisting procedure amounts to a reassignment
of the spins of the fields in the theory. However,
there is an alternative twisting which can be done,
and this leads to another model known as the
topological B model. The usefulness of this observation
lies in the fact that the topological A model on a
Calabi-Yau target space M is related to the
topological B model on the mirror of M. This
relationship and the computation of correlation
functions in the A and B models thus shed light
on the nature of mirror symmetry.


Turbulence Theories


Introduction


Turbulence was initially defined as an irregular
motion in fluids. The cloud formations in the
atmosphere and the motion of water in rivers make
this point clear. These are but a few readily available
examples of a multitude of flows which display
turbulent regimes: from the blood that flows in our
veins and arteries to the motion of air within our
lungs and around us; from the flow of water in
creeks to the atmospheric and oceanic currents;
from the flows past submarines, ships, automobiles,
and aircraft to the combustion processes propelling
them; and in the flow of gas, oil, and water, from
the prospecting end to the entrails of the cities. The
great majority of flows in nature and in engineering
applications are in some way turbulent.

Figure 1 Illustration of the irregular motion of a turbulent flow
over a flat plate (thin lines), and of the well-defined velocity
profile of the mean flow (thick lines).


But turbulent flows are much more than simply 
irregular. More refined definitions were desirable 
and were later coined. A definitive and precise one, 
however, may only come when the phenomenon is 
fully understood. Nevertheless, several characteristic 
properties of a turbulent flow can be listed: 


Irregularity and unpredictability. A turbulent flow
is irregular both in space and time, displaying
unpredictable, random patterns.

Statistical order. From the irregularity of a turbulent
motion there emerges a certain statistical
order. Mean quantities and correlations are regular
and predictable (Figure 1).

Wide range of active scales. A wide range of scales of
motion are active and display an irregular motion,
yielding a large number of degrees of freedom.

Mixing and enhanced diffusivity. The fluid particles
undergo complicated and convoluted paths, causing
a large mixing of different parts of fluid. This
mixing significantly enhances diffusion, increasing
the transport of momentum, energy, heat, and
other advected quantities.

Vortex stretching. When a moving portion of fluid
also rotates transversally to its motion, an increase
in speed causes it to rotate faster, a phenomenon
called vortex stretching. This causes that portion
of fluid to become thinner and elongated, and to fold
and intertwine with other such portions. This is
an intrinsically three-dimensional mechanism
which plays a fundamental role in turbulence
and is associated with large fluctuations in the
vorticity field.


Turbulent Regimes 


Turbulence is studied from many perspectives. The 
subject of “transition to turbulence” attempts to 
describe the initial mechanisms responsible for the 
generation of turbulence starting from a laminar 
motion in particular geometries. This transition can 
be followed with respect to position in space (e.g., 
the flow becomes more complicated as we look 
further downstream on a flow past an obstacle or
over a flat plate) or to parameters (e.g., as we
increase the angle of attack of a wing or the pressure 
gradient in a pipe). This subject is divided into two 
cases: wall-bounded and free-shear flows. In the 
former, the viscosity, which causes the fluid to 
adhere to the surface of the wall, is the primary 
cause of the instability in the transition process. In 
the latter, inviscid mechanisms such as mixing layers 
and jets are the main factors. The tools for studying 
the transition to turbulence include linearization of 
the equations of motion around the laminar solu- 
tion, nonlinear amplitude equations, and bifurcation 
theory. 

“Fully developed turbulence,” on the other hand,
concerns turbulence which evolves without imposed
constraints, such as boundaries and external forces.
This can be thought of as turbulence in its “pure”
form, and it is somewhat of a theoretical framework
for research due to its idealized nature. Hypotheses
of homogeneity (when the mean quantities associated
with the statistical order characterizing a
turbulent flow are independent of position in space),
stationarity (idem in time), and isotropy (idem with
respect to rotations in space) concern fully developed
turbulent flows. The Kolmogorov theory was developed
in this context and it is the most fundamental
theory of turbulence. Current research is dedicated
in great part to unveiling the mechanisms behind a
phenomenon called intermittency and how it affects
the laws obtained from the conventional theory.
Research is also dedicated to deriving such laws as
much from first principles as possible, minimizing
the use of phenomenological and dimensional
analysis.

Real turbulent flows involve various regimes at
once. A typical flow past a blunt object, for
instance, displays laminar motion at its upstream
edge, a turbulent boundary layer further downstream,
and the formation of a turbulent wake
(Figure 2). The subject of turbulent boundary layers
is a world in itself, with current research aiming to
determine mean properties of flows over rough
surfaces and varied topography. Convective turbulence
involves coupling with active scalars such as
large heat gradients, occurring in the atmosphere,
and large salinity gradients, in the ocean. Geophysical
turbulence also involves stratification and the
anisotropy generated by Earth’s rotation. Anisotropic
turbulence is also crucial in astrophysics and
plasma theory. Multiphase and multicomponent
turbulence appear in flows with suspended particles
or bubbles and in mixtures such as gas, water, and
oil. Transonic and supersonic flows are also of great
importance and fall into the category of compressible
turbulence, much less explored than the
incompressible case.

Figure 2 Illustration of a flow past an object, with a laminar
boundary layer (light gray), a turbulent boundary layer (medium
gray), and a turbulent wake (dark gray).

In all those real situations one would like, from the 
engineering point of view, to compute mean proper- 
ties of the flow, such as drag and lift for more 
efficient designs of aircraft, ships, and other vehicles. 
Knowledge of the drag coefficient is also of funda- 
mental importance in the design of pipes and pumps, 
from pipelines to artificial human organs. Mean 
turbulent diffusion coefficients of heat and other 
passive scalars (quantities advected by the flow
without interfering with it, such as chemical products,
nutrients, moisture, and pollutants) are also of
major importance in industry, ecology, meteorology, 
and climatology, for instance. And in most of those 
cases a large amount of research is dedicated to the 
“control of turbulence,” either to increase mixing 
or reduce drag, for instance. From a theoretical 
point of view, one would like to fully understand 
and characterize the mechanisms involved in 
turbulent flows, clarifying this fascinating phe- 
nomenon. This could also improve practical appli- 
cations and lead to a better control of turbulence. 

The concept of “two-dimensional turbulence” is
controversial. A two-dimensional flow may be 
irregular and display mixing, statistical order, and 
a wide range of active scales but definitely it does 
not involve vortex stretching since the velocity field 
is always perpendicular to the vorticity field. For this 
reason many researchers discard two-dimensional 
turbulence altogether. It is also argued that real 
two-dimensional flows are unstable at complicated 
regimes and soon develop into a three-dimensional 
flow. Nevertheless, many believe that two-dimensional 
turbulence, even lacking vortex stretching, is of 
fundamental theoretical importance. It may shed
some light on the three-dimensional theory and
modeling, and it can serve as an approximation to 
some situations such as the motion of the atmos- 
phere and oceans in the large and meso scales and 
some magnetohydrodynamic flows. The relative 
shallowness of the atmosphere and oceans or the 
imposition of a strong uniform magnetic field may 
force the flow into two-dimensionality, at least for 
a certain range of scales.

“Chaos” serves as a paradigm for turbulence, in
the sense that it is now accepted that turbulence is a
dynamic process in a sensitive deterministic
system. But not all chaotic motions in fluids are 
termed turbulent for they may not display mixing 
and vortex stretching or involve a wide range of 
scales. An important such example appears in the 
dispersive, nonlinear interactions of waves. 


The Equations of Motion 


It is usually stressed that turbulence is a continuum 
phenomenon, in the sense that the active scales are 
much larger than the collision mean free path 
between molecules. For this reason, turbulence is 
believed to be fully accounted for by the Navier- 
Stokes equations. 

In the case of incompressible homogeneous flows, 
the Navier-Stokes equations in the Eulerian form 
and in vector notation read 

$\frac{\partial u}{\partial t} - \nu \Delta u + (u \cdot \nabla) u + \nabla p = f$ [1a]


$\nabla \cdot u = 0$ [1b]


Here, $u = u(x,t) = (u_1, u_2, u_3)$ denotes the velocity
vector of an idealized fluid particle located at
position $x = (x_1, x_2, x_3)$, at time $t$. The mass density
in a homogeneous flow is constant, denoted $\rho$. The
constant $\nu$ denotes the kinematic viscosity of the
fluid, which is the molecular viscosity $\mu$ divided by
$\rho$. The variable $p = p(x,t)$ is the kinematic pressure,
and $f = f(x,t) = (f_1, f_2, f_3)$ denotes the mass density
of volume forces.
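
The incompressibility condition [1b] can be checked numerically for a given velocity field. The sketch below is only an illustration: the field `velocity` (a simple sinusoidal pattern), the helper `divergence`, and the step size `h` are assumptions introduced here, not quantities from the text.

```python
import math


def velocity(x, y, z):
    """Hypothetical sample field u = (sin x cos y, -cos x sin y, 0)."""
    return (math.sin(x) * math.cos(y), -math.cos(x) * math.sin(y), 0.0)


def divergence(field, point, h=1e-5):
    """Approximate div u = sum_j d u_j / d x_j with central differences."""
    total = 0.0
    for j in range(3):
        plus = list(point)
        minus = list(point)
        plus[j] += h
        minus[j] -= h
        # Derivative of the j-th component along the j-th coordinate.
        total += (field(*plus)[j] - field(*minus)[j]) / (2.0 * h)
    return total


if __name__ == "__main__":
    # The two nonzero terms cancel, so the result is zero up to rounding.
    print(divergence(velocity, (0.3, 1.1, -0.7)))
```

For this field the two nonzero contributions, $\cos x \cos y$ and $-\cos x \cos y$, cancel exactly, so the computed divergence differs from zero only by discretization and rounding errors.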

Equation [1a] expresses the conservation of linear
momentum. The term $\nu \Delta u$ accounts for the dissipation
of energy due to molecular viscosity, and the
nonlinear term $(u \cdot \nabla) u$, also called the inertial term,
accounts for the redistribution of energy among
different structures and scales of motion. Equation
[1b] represents the incompressibility condition. In
Einstein's summation convention, these equations
can be written as


$\frac{\partial u_i}{\partial t} - \nu \frac{\partial^2 u_i}{\partial x_j^2} + u_j \frac{\partial u_i}{\partial x_j} + \frac{\partial p}{\partial x_i} = f_i, \qquad \frac{\partial u_j}{\partial x_j} = 0$


The Reynolds Number 


The transition to turbulence was carefully studied by 
Reynolds in the late nineteenth century in a series of 
experiments in which water at rest in a tank was 
allowed to flow through a glass pipe. Starting with 
dimensional analysis, Reynolds argued that a critical
value of a certain nondimensional quantity was
likely to exist beyond which a laminar flow gives
rise to a “sinuous” motion. This was followed by
observations of the flow for tubes with different
diameter $L$, different mean velocities $U$ across the
tube section, and with the kinematic viscosity
$\nu = \mu/\rho$ being altered through changes in temperature.
The experiments confirmed the existence of
such a critical value for what is now called the
Reynolds number:

$Re = \frac{LU}{\nu}$

The dimensional analysis argument can be reproduced
in the following form: the physical dimension
for the inertial term in [1a] is $U^2/L$, while that for
the viscous term is $\nu U/L^2$. The ratio between them
is precisely $Re = LU/\nu$. For small values of Re
viscosity dominates and the flow is laminar, whereas 
for large values of Re the inertial term dominates, 
and the flow becomes more complicated and 
eventually turbulent. In applications, different types 
of Reynolds number can be used depending on the 
choice of the characteristic velocity and length, but 
in any case, the larger the Reynolds number, the 
more complicated the flow. 
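
The dimensional argument above can be turned into a short calculation. This is a minimal sketch: the function names and the sample values (a kinematic viscosity of roughly $10^{-6}$ m²/s, typical of water near room temperature, and an arbitrary pipe diameter and velocity) are illustrative assumptions.

```python
def reynolds_number(length, velocity, kinematic_viscosity):
    """Re = L U / nu for a characteristic length L and velocity U."""
    if kinematic_viscosity <= 0.0:
        raise ValueError("kinematic viscosity must be positive")
    return length * velocity / kinematic_viscosity


def inertial_to_viscous_ratio(length, velocity, kinematic_viscosity):
    """Ratio of the scale estimates U^2/L (inertial) and nu U/L^2 (viscous)."""
    inertial = velocity ** 2 / length
    viscous = kinematic_viscosity * velocity / length ** 2
    return inertial / viscous


if __name__ == "__main__":
    # Assumed illustrative values: L = 0.05 m, U = 1 m/s, nu = 1e-6 m^2/s.
    print(reynolds_number(0.05, 1.0, 1.0e-6))
    print(inertial_to_viscous_ratio(0.05, 1.0, 1.0e-6))
```

Both functions return the same number, which is the point of the argument: the Reynolds number is the ratio of the inertial to the viscous scale estimate.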


The Reynolds Equations 


Another advance put forward by Reynolds in a
subsequent article was to decompose the flow into a
mean component and the remaining fluctuations. In
terms of the velocity and pressure fields this can be
written as


$u = \bar{u} + u', \qquad p = \bar{p} + p'$ [2]


with $\bar{u}$ and $\bar{p}$ representing the mean components and
$u'$ and $p'$, the fluctuations. By substituting [2] into
[1] and averaging, one finds the Reynolds-averaged
Navier-Stokes (RANS) equations for the mean flow:


$\frac{\partial \bar{u}}{\partial t} - \nu \Delta \bar{u} + (\bar{u} \cdot \nabla) \bar{u} + \nabla \bar{p} = \bar{f} + \nabla \cdot \tau$


$\nabla \cdot \bar{u} = 0$


It differs from [1] only by the addition of the
Reynolds stress tensor:


$\tau = -\overline{u' \otimes u'} = -\left(\overline{u'_i u'_j}\right)_{i,j=1}^{3}$


In a laminar flow, the fluctuations are negligible;
otherwise, this decomposition shows how they
influence the mean flow through these additional
turbulent stresses.
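
The decomposition into mean and fluctuations, and the resulting stress tensor, can be illustrated on a handful of velocity samples. In this sketch the mean is taken as a plain sample average standing in for the averaging operation, and the function name and sample data are assumptions made for illustration.

```python
def reynolds_decomposition(samples):
    """Split velocity samples into mean, fluctuations and Reynolds stresses.

    samples: list of velocity vectors (tuples of equal length).
    Returns (mean, fluctuations, tau) with tau_ij = -mean(u'_i u'_j).
    """
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(dim)]
    fluctuations = [[s[i] - mean[i] for i in range(dim)] for s in samples]
    tau = [
        [-sum(f[i] * f[j] for f in fluctuations) / n for j in range(dim)]
        for i in range(dim)
    ]
    return mean, fluctuations, tau


if __name__ == "__main__":
    # Hypothetical samples of (u_1, u_2, u_3) at one point.
    data = [(1.0, 0.0, 0.0), (3.0, 2.0, 0.0), (2.0, -2.0, 0.0)]
    mean, fluctuations, tau = reynolds_decomposition(data)
    print(mean)
    print(tau)
```

By construction the fluctuations average to zero, and the tensor is symmetric with nonpositive diagonal entries, since each diagonal entry is minus the mean square of a fluctuation component.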


The Closure Problem and Turbulence Models


The RANS equations cannot be solved directly for the 
mean flow since the Reynolds stresses are unknown. 
Equations for these stress terms can be derived but they 
involve further unknown moments. This continues 
with equations for moments of a given order depend- 
ing on new moments up to a higher order, leading to 
an infinite system of equations known as the Fried- 
man-Keller system. For practical applications, 
approximations closing the system at some finite 
order are needed, in what is called the closure problem. 
Several ad hoc approximations exist, the most famous 
being the Boussinesq eddy-viscosity approximation, in 
which the turbulent fluctuations are regarded as 
increasing the viscosity of the flow. Prandtl's mixing- 
length hypothesis yields a prescription for the compu- 
tation of this eddy viscosity, and together they form the 
basis of the algebraic models of turbulence. Other 
models involve additional equations, such as the $k$-$\epsilon$
and $k$-$\omega$ models. Most of the practical computations of
industrial flows are based on such lower-order models, 
and a large amount of research is done to determine 
appropriate values for the various ad hoc parameters 
which appear in these models and which are highly 
dependent on the geometry of the flow. This depen- 
dency can be explained by the fact that the RANS is 
supposed to model the mean flow even at the large 
scales of motion, which are highly affected by the 
geometry. 
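
To make the eddy-viscosity idea concrete, the sketch below uses the common textbook form of the mixing-length prescription for a simple shear flow, $\nu_T = \ell_m^2 \, |d\bar{u}/dy|$, with the turbulent shear stress modeled as $\nu_T \, d\bar{u}/dy$. The text above does not spell out these formulas, so they, the symbol $\ell_m$, and the function names are assumptions of this illustration.

```python
def eddy_viscosity(mixing_length, mean_shear):
    """Mixing-length estimate nu_T = l_m^2 * |dU/dy| (assumed textbook form)."""
    return mixing_length ** 2 * abs(mean_shear)


def turbulent_shear_stress(mixing_length, mean_shear):
    """Eddy-viscosity model of the kinematic shear stress: nu_T * dU/dy."""
    return eddy_viscosity(mixing_length, mean_shear) * mean_shear


if __name__ == "__main__":
    # Hypothetical values: mixing length 0.1 m, mean shear -2.0 1/s.
    print(eddy_viscosity(0.1, -2.0))
    print(turbulent_shear_stress(0.1, -2.0))
```

The eddy viscosity is always nonnegative, so the modeled stress has the sign of the mean shear and acts like an enhanced viscosity on the mean flow. The mixing length itself is one of the ad hoc, geometry-dependent parameters mentioned above.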

Computational fluid dynamics (CFD) is indeed a 
fundamental tool in turbulence, both for research and 
engineering applications. From the theoretical side, 
direct numerical simulations (DNS), which attempt to 
resolve all the active scales of the flow, reveal some 
fundamental mechanisms involved in the transition to 
turbulence and in vortex stretching. As for applica- 
tions, DNS applies to flows up to low-Reynolds 
turbulence, with the current computational power 
not allowing for a full resolution of all the scales 
involved in high-Reynolds flows. The current rate
of evolution of computational power suggests that
this will remain so for several decades.

An intermediate CFD method between RANS and 
DNS is the large-eddy simulation (LES), which
attempts to fully resolve the large scales while 
modeling the turbulent motion at the smaller scales. 
Several models have been proposed which have their 
own advantages and limitations as compared to 
RANS and DNS. It is currently a subject of intense 
research, particularly for the development of suitable 
models for the structure functions near the boundary. 
Theoretical results on fully developed turbulence play 
a fundamental role in the modeling process. 


LESs are a promising tool and they have been 
successfully applied to a number of situations. The 
choice of the best method for a given application, 
however, depends very much on the Reynolds 
number of the flow and the prior knowledge of 
similar situations for adjusting the parameters. 


Elements of the Statistical Theory 


Several types of averages can be used. The ensemble
average is taken with respect to a number of experiments
at nearly identical conditions. Despite the
irregular motion of, say, the velocity vector $u^{(n)}(x,t)$
of each experiment $n = 1, \ldots, N$, the average value


$\frac{1}{N} \sum_{n=1}^{N} u^{(n)}(x,t)$


is expected to behave in a more regular way. This
type of averaging is usually denoted with the symbol
$\langle \cdot \rangle$. This notion can be cast into the context of a
probability space $(\mathcal{M}, \Sigma, P)$, where $\mathcal{M}$ is a set, $\Sigma$ is a
$\sigma$-algebra of subsets of $\mathcal{M}$, and $P$ is a probability
measure on $\Sigma$. The velocity field is a random
variable in the sense that it is a measurable function
$\omega \mapsto u(x,t,\omega)$ from $\mathcal{M}$ into the space of time-dependent
divergence-free velocity fields. The mean
velocity field in this context is regarded as


$\bar{u}(x,t) = \langle u(x,t) \rangle = \int_{\mathcal{M}} u(x,t,\omega) \, dP(\omega)$


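Other flow quantities such as energy and correlations
in space and time can be expressed by means
of a function $\varphi = \varphi(u(\cdot,\cdot))$ of the velocity field,
with their mean value given by


$\langle \varphi(u(\cdot,\cdot)) \rangle = \int_{\mathcal{M}} \varphi(u(\cdot,\cdot,\omega)) \, dP(\omega)$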

In general, the statistics of the flow are allowed to
change with time. A particular situation is when
statistical equilibrium is reached, so that $\langle u(x,t) \rangle$
and, more generally, $\langle \varphi(u(\cdot,\cdot + t)) \rangle$ are independent
of $t$. In this case, an ergodic assumption is usually
invoked, which means that for “most” individual
flows $u(\cdot,\cdot,\omega_0)$ (i.e., for almost all $\omega_0$ with respect to
the probability measure $P$), the time averages along
this flow converge, as the period of the average
increases, to the mean value obtained by the
ensemble average:


$\lim_{T \to \infty} \frac{1}{T} \int_0^T \varphi(u(\cdot,\cdot + s,\omega_0)) \, ds = \int_{\mathcal{M}} \varphi(u(\cdot,\cdot,\omega)) \, dP(\omega)$


Based on this assumption, the averages may in
practice be calculated as time averages over a
sufficiently large period $T$. There is a related
argument, based on the mechanics of turbulence,
for substituting space averages by time averages,
which is called the “Taylor hypothesis.”
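
The equality of time and ensemble averages can be seen in a toy model that has nothing to do with fluid motion but has the same structure. In this sketch, which is entirely an illustrative assumption, the "flow" is $u(t,\omega) = \cos(t + \omega)$ with the phase $\omega$ uniformly distributed on $[0, 2\pi)$, and the observed quantity is $\varphi(u) = u^2$.

```python
import math


def u(t, omega):
    """Toy signal: one realization per phase omega (hypothetical model)."""
    return math.cos(t + omega)


def phi(value):
    """Observed quantity: here simply the square of the signal."""
    return value * value


def ensemble_average(t, n=2000):
    """Average of phi(u(t, omega)) over omega uniform on [0, 2*pi)."""
    step = 2.0 * math.pi / n
    return sum(phi(u(t, (k + 0.5) * step)) for k in range(n)) / n


def time_average(omega0, period, n=200000):
    """Average of phi(u(s, omega0)) over 0 <= s <= period (midpoint rule)."""
    ds = period / n
    return sum(phi(u((k + 0.5) * ds, omega0)) for k in range(n)) / n


if __name__ == "__main__":
    print(ensemble_average(0.0), ensemble_average(3.7))
    print(time_average(0.4, 10.0), time_average(0.4, 1000.0))
```

The ensemble average equals 1/2 for every $t$, which is the statistical equilibrium property, and the time average along a single realization approaches the same value as the period grows, with a deviation of order $1/T$.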

Another fundamental concept in the statistical 
theory is that of homogeneity, which is the spatial 
analog of the statistical equilibrium in time. 
In homogeneous turbulence, the statistical quantities 
of a flow are independent of translations in space, 
that is, 


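$\langle \varphi(u(\cdot + \ell, \cdot)) \rangle = \langle \varphi(u(\cdot,\cdot)) \rangle$


for all $\ell \in \mathbb{R}^3$. The concept of isotropic turbulence
assumes further independence with respect to
rotations and reflections in the frame of reference,
that is,


$\langle \varphi(Q \, u(Q^{t} \cdot, \cdot)) \rangle = \langle \varphi(u(\cdot,\cdot)) \rangle$


for all orthogonal transformations $Q$ in $\mathbb{R}^3$, with
adjoint $Q^{t}$.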

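Under the homogeneity assumption, mean quantities
can be defined independently of position in
space, such as the mean kinetic energy per unit mass


$e = \frac{1}{2} \langle |u(x)|^2 \rangle = \frac{1}{2} \sum_{i=1}^{3} \langle |u_i(x)|^2 \rangle$


and the mean rate of viscous energy dissipation per
unit mass and unit time


$\epsilon = \nu \langle |\nabla \otimes u(x)|^2 \rangle = \nu \sum_{i,j=1}^{3} \left\langle \left| \frac{\partial u_j(x)}{\partial x_i} \right|^2 \right\rangle$


The mean kinetic energy can be written as
$e = \mathrm{tr}R(0)/2$, where


$\mathrm{tr}R(\ell) = R_{11}(\ell) + R_{22}(\ell) + R_{33}(\ell)$


is the trace of the correlation tensor


$R(\ell) = \langle u(x) \otimes u(x + \ell) \rangle = (R_{ij}(\ell))_{i,j=1}^{3} = (\langle u_i(x) u_j(x + \ell) \rangle)_{i,j=1}^{3}$


which measures the correlation between the velocity
components at different positions in space. From the
homogeneity assumption, this tensor is a function
only of the relative position $\ell$. Then, assuming that
the Fourier transform of $\mathrm{tr}R(\ell)$ exists, and denoting
it by $Q(\boldsymbol{\kappa})$, for $\boldsymbol{\kappa} \in \mathbb{R}^3$, we have


$\mathrm{tr}R(\ell) = \frac{1}{(2\pi)^3} \int_{\mathbb{R}^3} Q(\boldsymbol{\kappa}) e^{i \ell \cdot \boldsymbol{\kappa}} \, d\boldsymbol{\kappa}, \qquad \ell \in \mathbb{R}^3$


$\mathrm{tr}R(0) = \frac{1}{(2\pi)^3} \int_{\mathbb{R}^3} Q(\boldsymbol{\kappa}) \, d\boldsymbol{\kappa} = 2 \int_0^\infty S(\kappa) \, d\kappa$


where $S(\kappa)$ is the energy spectrum defined by


$S(\kappa) = \frac{1}{2 (2\pi)^3} \int_{|\boldsymbol{\kappa}| = \kappa} Q(\boldsymbol{\kappa}) \, d\Sigma(\boldsymbol{\kappa}), \qquad \forall \kappa > 0$


with $d\Sigma(\boldsymbol{\kappa})$ denoting the area element of the
2-sphere of radius $|\boldsymbol{\kappa}|$. Then we can write


$e = \frac{1}{2} \langle |u(x)|^2 \rangle = \frac{1}{2} \mathrm{tr}R(0) = \int_0^\infty S(\kappa) \, d\kappa$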


By expanding the velocity coordinates into Fourier
modes $\exp(i \ell \cdot \boldsymbol{\kappa})$, with $\kappa \le |\boldsymbol{\kappa}| < \kappa + d\kappa$, and
interpreting them as “eddies” with characteristic
wave number $|\boldsymbol{\kappa}|$, the quantity $S(\kappa) \, d\kappa$ can be
interpreted as the energy of the component of the
flow formed by the “eddies” with characteristic
wave number between $\kappa$ and $\kappa + d\kappa$.

Similarly,


$\epsilon = 2\nu \int_0^\infty \kappa^2 S(\kappa) \, d\kappa$


and we obtain the dissipation spectrum $2\nu \kappa^2 S(\kappa)$,
which can be interpreted as the density of energy
dissipation occurring at wave number $\kappa$.

In the previous arguments it is assumed that the 
flow extends to all of the space $\mathbb{R}^3$. This avoids the
presence of boundaries, addressing the idealized case 
of fully developed turbulence. It is sometimes 
customary to assume as well that the flow is 
periodic in space to avoid problems with unbounded 
domains such as infinite kinetic energy. 

The random nature of turbulent flows was greatly 
explored by Taylor in the early twentieth century, 
who introduced most of the concepts described 
above. Another important concept he introduced 
was the Taylor microlength $\ell_T$, which is a characteristic
length for the small scales based on the
correlation tensor. A microscale Reynolds number 
based on the Taylor microlength is very often used 
in applications. 


Kolmogorov Theory 


An inspiring concept in the theory of turbulence is
Richardson's “energy cascade” process. For large
Reynolds numbers the nonlinear term dominates the
viscous term according to the dimensional analysis, but
this is valid only for the large-scale structures. The
small scales have their own characteristic length and
velocity. In the cascade process, the inertial term is
responsible for the transfer of energy to smaller and
smaller scales until small enough scales are reached
for which viscosity becomes important (Figure 3). At
those smallest scales kinetic energy is finally dissipated
into heat. It should be emphasized that
turbulence is a dissipative process; no matter how
large the Reynolds number is, viscosity plays a role
in the smallest scales.

Figure 3 Illustration of the eddy breakdown process in which
energy is transferred to smaller eddies and so on until the smallest
scales are reached and the energy is dissipated by viscosity.

The Kolmogorov theory of locally isotropic 
turbulence allows for inhomogeneity and anisotropy 
in the large scales, which contain most of the energy, 
assuming that with the cascade transfer of energy to 
smaller scales, the orienting effects generated in the 
large scales become weaker and weaker so that for 
sufficiently small eddies the motion becomes statis- 
tically homogeneous, isotropic, and independent of 
the particular energy-productive mechanisms. Kolmogorov
proposed that the statistical regime of the small-scale
eddies is then universal and depends only on $\nu$
and $\epsilon$. The equilibrium range is defined as the range
of scales in which this universality holds. 

Simple dimensional analysis shows that the only
algebraic combination of $\nu$ and $\epsilon$ with dimension of
length is $\ell_\epsilon = (\nu^3/\epsilon)^{1/4}$, which is then interpreted as
the length near which the viscous effect becomes
important and hence most of the energy dissipation takes
place. The scale $\ell_\epsilon$ is known as the Kolmogorov
dissipation length.
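
The dissipation length is straightforward to evaluate once $\nu$ and $\epsilon$ are given. In the sketch below the function name and the numerical values (a kinematic viscosity of about $1.5 \times 10^{-5}$ m²/s, roughly that of air, and a dissipation rate of 1 m²/s³) are illustrative assumptions, not data from the text.

```python
def kolmogorov_length(nu, eps):
    """Dissipation length (nu^3 / eps)^(1/4).

    nu has units m^2/s and eps has units m^2/s^3, so nu^3/eps has units m^4.
    """
    if nu <= 0.0 or eps <= 0.0:
        raise ValueError("nu and eps must be positive")
    return (nu ** 3 / eps) ** 0.25


if __name__ == "__main__":
    # Assumed illustrative values for air with a moderate dissipation rate.
    print(kolmogorov_length(1.5e-5, 1.0))
```

The exponents are fixed by the units: $\nu^3/\epsilon$ is the only product of powers of $\nu$ and $\epsilon$ in which time cancels, and the fourth root turns it into a length. Multiplying $\nu$ by 16 at fixed $\epsilon$ therefore multiplies the length by $16^{3/4} = 8$.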

Kolmogorov theory gives particular attention to
moments involving differences of velocities, such as
the $p$th-order structure function


$S_p(\ell) \stackrel{\mathrm{def}}{=} \langle (u(x + \ell \mathbf{e}) \cdot \mathbf{e} - u(x) \cdot \mathbf{e})^p \rangle$


where $\mathbf{e}$ may be taken as an arbitrary unit vector,
thanks to the isotropy assumption. By restricting the
search for universal laws for the structure functions
to small values of $\ell$, anisotropy and inhomogeneity
are allowed in the large scales.
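
In practice a structure function is often estimated from a record of one velocity component sampled at equally spaced points along a line. The sketch below is one such estimate under stated assumptions: the ensemble average is replaced by an average over positions (which presumes homogeneity along the line), the lag is given in units of the sample spacing, and the function name and data are hypothetical.

```python
def structure_function(samples, lag, order):
    """Estimate S_p at separation `lag` sample spacings, with p = order.

    samples: longitudinal velocity component at equally spaced points.
    The average over positions stands in for the ensemble average.
    """
    if lag <= 0 or lag >= len(samples):
        raise ValueError("lag must satisfy 0 < lag < len(samples)")
    increments = [samples[i + lag] - samples[i] for i in range(len(samples) - lag)]
    return sum(d ** order for d in increments) / len(increments)


if __name__ == "__main__":
    # Hypothetical record, used only to exercise the function.
    record = [0.0, 1.0, 4.0, 9.0, 16.0]
    print(structure_function(record, 1, 2))
    print(structure_function(record, 1, 3))
```

Odd orders keep the sign of the increments, which is why the third-order function can be negative, while even orders measure only their magnitude.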


The theory assumes a wide separation between
the energy-containing scales, of order say $\ell_0$, and the
energy-dissipative scales, of order $\ell_\epsilon$, so that the
cascade process occurs within a wide range of scales
$\ell$ such that $\ell_0 \gg \ell \gg \ell_\epsilon$. In this range, termed the
inertial range, the viscous effects are still negligible
and the statistical regime should depend only on $\epsilon$.
Then, the Kolmogorov “two-thirds law” asserts that
within the inertial range the second-order correlations
must be proportional to $(\epsilon \ell)^{2/3}$, that is,


$S_2(\ell) = C_K (\epsilon \ell)^{2/3}$


for some constant $C_K$ known as the Kolmogorov
constant in physical space (there is a related constant
in spectral space). The argument extends to higher-order
structure functions, yielding


$S_p(\ell) = C_p (\epsilon \ell)^{p/3}$

Kolmogorov’s derivation of these results was not by
dimensional analysis; it was in fact a more convincing
self-similarity argument based on the universality
assumed for the equilibrium range. A different argument
without resorting to universality assumptions,
however, was applied to the third-order structure
function, yielding the more precise “four-fifths law”:


$S_3(\ell) = -\frac{4}{5} \epsilon \ell$


The “Kolmogorov five-thirds law” concerns the
energy spectrum $S(\kappa)$ and is the spectral version of
the two-thirds law, given by Obukhoff:


$S(\kappa) = C'_K \epsilon^{2/3} \kappa^{-5/3}$


The constant $C'_K$ is the Kolmogorov constant
in spectral space. The spectral version of the
dissipation length is the Kolmogorov wave number
$\kappa_\epsilon = (\epsilon/\nu^3)^{1/4}$.
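
A power law such as the five-thirds law appears as a straight line when both axes are logarithmic, and its exponent is the slope of that line. The sketch below checks this for a model spectrum. The function names are hypothetical, and the constant 1.5 is only an illustrative stand-in for the Kolmogorov constant in spectral space.

```python
import math


def inertial_spectrum(kappa, eps, constant=1.5):
    """Model spectrum C * eps^(2/3) * kappa^(-5/3); C = 1.5 is illustrative."""
    return constant * eps ** (2.0 / 3.0) * kappa ** (-5.0 / 3.0)


def loglog_slope(spectrum, kappa_low, kappa_high):
    """Slope of log(spectrum) against log(kappa) between two wave numbers."""
    rise = math.log(spectrum(kappa_high)) - math.log(spectrum(kappa_low))
    run = math.log(kappa_high) - math.log(kappa_low)
    return rise / run


if __name__ == "__main__":
    slope = loglog_slope(lambda k: inertial_spectrum(k, 0.1), 10.0, 1000.0)
    print(slope)
```

The slope does not depend on the constant or on the value of the dissipation rate, which only shift the line vertically.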

A typical distribution of energy in a turbulent
flow is depicted in Figure 4. The energy is
concentrated on the large scales, while the dissipation
is concentrated near the Kolmogorov scale $\ell_\epsilon$.
The five-thirds law becomes visible as a straight line
in the logarithmic scale.

A more precise mechanism for the energy cascade
assumes that in the inertial range, eddies with length
scale $\ell$ transfer kinetic energy to smaller eddies during
their characteristic timescale, also known as circulation
time. If $u_\ell$ is their characteristic velocity, then
$\tau_\ell = \ell/u_\ell$ is their circulation time, so that the kinetic
energy transferred from these eddies per unit time is


$\epsilon_\ell \sim \frac{u_\ell^2}{\tau_\ell} = \frac{u_\ell^3}{\ell}$


In statistical equilibrium, the energy lost to the
smaller scales equals the energy gained from the
larger scales, and that should also equal the total
kinetic energy dissipated by viscous effects. Hence,
$\epsilon_\ell = \epsilon$, and we find


$\epsilon \sim \frac{u_\ell^3}{\ell}$


It also follows that $\tau_\ell = \ell/u_\ell \sim \ell/(\epsilon \ell)^{1/3} = \epsilon^{-1/3} \ell^{2/3}$, so
that the circulation time decreases with the length
scale and becomes of the order of the viscous
dissipation time $(\nu/\epsilon)^{1/2}$ precisely when $\ell \sim \ell_\epsilon$.
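
The last statement can be verified directly from the scaling relations, treating each order-of-magnitude relation as an equality. The function names and the sample values of $\nu$ and $\epsilon$ in this sketch are assumptions made for illustration.

```python
def eddy_velocity(eps, ell):
    """Characteristic velocity (eps * ell)^(1/3) of eddies of size ell."""
    return (eps * ell) ** (1.0 / 3.0)


def circulation_time(eps, ell):
    """Circulation time ell / u_ell = eps^(-1/3) * ell^(2/3)."""
    return ell / eddy_velocity(eps, ell)


def kolmogorov_length(nu, eps):
    """Dissipation length (nu^3 / eps)^(1/4)."""
    return (nu ** 3 / eps) ** 0.25


def viscous_time(nu, eps):
    """Viscous dissipation time (nu / eps)^(1/2)."""
    return (nu / eps) ** 0.5


if __name__ == "__main__":
    nu, eps = 1.5e-5, 2.0  # hypothetical values in SI units
    print(circulation_time(eps, kolmogorov_length(nu, eps)))
    print(viscous_time(nu, eps))
```

Substituting $\ell = (\nu^3/\epsilon)^{1/4}$ into $\epsilon^{-1/3} \ell^{2/3}$ gives $\epsilon^{-1/3} \nu^{1/2} \epsilon^{-1/6} = (\nu/\epsilon)^{1/2}$, so the two printed values agree.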

A similar relation between $\epsilon$ and the large scales
can also be obtained with heuristic arguments: let $e$
be the mean kinetic energy and $\ell_0$, a characteristic
length for the large scales. Then $u_0$ given by $e = u_0^2/2$
is a characteristic velocity for the large scales, and
$\tau_0 = \ell_0/u_0$ is the large-scale circulation time. In
statistical equilibrium, the rate $\epsilon$ of kinetic energy
dissipated per unit time and unit mass is expected to
be of the order of $e/\tau_0$, hence


$\epsilon \sim \frac{u_0^3}{\ell_0}$


which is called the “energy dissipation law.”


Figure 4 A typical distribution for the energy spectrum $S(\kappa)$ and the dissipation spectrum $2\nu\kappa^2 S(\kappa)$ in spectral space in
nonlogarithmic and logarithmic scales, with the inertial range and the equilibrium range marked. The energy is mostly
concentrated on the large scales while the dissipation is concentrated near the dissipation scale. In the logarithmic scale,
the five-thirds law for the energy spectrum stands out as a straight line with slope -5/3 over the inertial range.

Figure 5 A schematic representation of a flow structure
displaying a range of active scales and a three-dimensional
grid with linear dimension $\ell_0$ and mesh length $\ell_\epsilon$, sufficient to
represent all the active scales in a turbulent flow. The number of
degrees of freedom is the number of blocks: $(\ell_0/\ell_\epsilon)^3$.


From the energy dissipation law, several relations
between characteristic quantities of turbulent flows can
be obtained, such as $\ell_0/\ell_\epsilon \sim Re^{3/4}$, for $Re = \ell_0 u_0/\nu$.

Now, assuming the active scales in a turbulent
flow exist down to the Kolmogorov scale $\ell_\epsilon$, one
needs a three-dimensional grid with mesh spacing $\ell_\epsilon$
to resolve all the scales, which means that the
number $N$ of degrees of freedom of the system is of
the order of $N \sim (\ell_0/\ell_\epsilon)^3$ (see Figure 5). This
number can be estimated in terms of the Reynolds
number by $N \sim Re^{9/4}$. This relation is important in
predicting the computational power needed to
simulate all the active scales in turbulent flows.
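To get a feeling for how quickly this estimate grows, the following sketch tabulates the scale ratio and the number of degrees of freedom for a few Reynolds numbers. It is a rough illustration only: the order-one constants hidden in the scaling relations are set to 1, the Reynolds numbers are arbitrary sample values, and the function names are hypothetical.

```python
def scale_ratio(reynolds):
    """Ratio l_0 / l_eps of largest to smallest active scale, ~ Re^(3/4)."""
    return reynolds ** 0.75


def degrees_of_freedom(reynolds):
    """Number of grid blocks (l_0 / l_eps)^3, ~ Re^(9/4)."""
    return reynolds ** 2.25


for reynolds in (1.0e3, 1.0e4, 1.0e6, 1.0e8):
    print(
        f"Re = {reynolds:.0e}   l0/l_eps ~ {scale_ratio(reynolds):.2e}   "
        f"N ~ {degrees_of_freedom(reynolds):.2e}"
    )
```

Each factor of 10 in the Reynolds number multiplies the estimated number of degrees of freedom by $10^{9/4}$, that is, by a factor of roughly 180.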

Several such universal laws can be deduced and 
extended to other situations such as turbulent 
boundary layers, with the famous logarithmic law 
of the wall. They play a fundamental role in 
turbulence modeling and closure, for the calculation 
of the mean flow and other quantities. 


Intermittency 


The universality hypothesis based on a constant mean
energy dissipation rate throughout the flow received
some criticism and was later modified by Kolmogorov
in an attempt to account for observed large
deviations from the mean rate of energy dissipation. This
phenomenon of intermittency is related to the vortex
stretching and thinning mechanism, which leads to the 
formation of coherent structures of vortex filaments of 
high vorticity and low dissipation (Figure 6). These 
filaments have diameter as small as the Kolmogorov 
scale and longitudinal length extending from the 
Taylor scale up to the large scales and with a lifetime 
of the order of the large-scale circulation time. 

Figure 6 A portion of rotating fluid gets stretched and thinned
as the flow speeds up, generating one of many coherent
structures of high vorticity and low dissipation.

It has been argued based on experimental evidence
that intermittency leads to modified power laws
$S_p(\ell) \propto \ell^{\zeta(p)}$, $\zeta(p) < p/3$, for high-order ($p > 3$) structure
functions. The issues of intermittency and
coherent structures and whether and how they could 
affect the deductions of the universality theory such as 
the power laws for the structure functions are far from 
settled and are currently one of the major and most 
fascinating issues being addressed in turbulence 
theory. Several phenomenological theories attempt to 
adjust the universality theory to the existence of such 
coherent structures. Multifractal models, for instance, 
suppose that the eddies generated in the cascade 
process do not fill up the space and form multifractal 
structures. Field-theoretic renormalization group 
develops techniques based on quantum field renor- 
malization theory. Intermediate asymptotics also 
exploits self-similar analysis and renormalization 
theory but with a somewhat different flavor. Detailed 
mathematical analysis of the vorticity equations is 
also playing a major role in the understanding of the 
dynamics of the vorticity field. 
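As an illustration of what a modified power law with $\zeta(p) < p/3$ can look like, the sketch below compares the universality prediction $\zeta(p) = p/3$ with one widely quoted phenomenological formula, the She-Leveque model. That model is not discussed in this article and is used here only as an assumed example; the function names are hypothetical.

```python
def zeta_universal(p):
    """Exponent p/3 predicted by the universality theory."""
    return p / 3.0


def zeta_she_leveque(p):
    """She-Leveque exponent p/9 + 2 * (1 - (2/3)^(p/3)); an assumed example model."""
    return p / 9.0 + 2.0 * (1.0 - (2.0 / 3.0) ** (p / 3.0))


for p in range(1, 11):
    print(f"p = {p:2d}   p/3 = {zeta_universal(p):.3f}   model = {zeta_she_leveque(p):.3f}")
```

The two exponents agree at $p = 3$, and because the model exponent is concave in $p$, it lies below $p/3$ for every $p > 3$.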


Mathematical Aspects 
of Turbulence Theory 


From a mathematical perspective, it is fundamental to 
develop a rigorous background upon which to study 
the physical quantities of a turbulent flow. The first 
problem in the mathematical theory is related to the 
deterministic nature of chaotic systems assumed in 
dynamical system theory and believed to hold in 
turbulence. This has actually not been proved for the 
Navier-Stokes equations. It is in fact one of the most 
outstanding open problems in mathematics to deter- 
mine whether given an initial condition for the velocity 
field there exists, in some sense, a unique solution of 
the Navier-Stokes equations starting with this initial 
condition and valid for all later times. It has been
proved that a global solution (i.e., valid for all later
times) exists but may not be unique, and it has
been proved that unique solutions exist which may not
be global (i.e., they are guaranteed to exist as unique
solutions only for a finite time).

The difficulty here is the possible existence of 
singularities in the vorticity field (vorticity becoming 
infinite at some points in space and time). Depending 
on how large the singularity set is, uniqueness may fail 
in strictly mathematical terms. The existence of
singularities may not be a purely mathematical
curiosity; it may in fact be related to the intermittency
phenomenon. Rigorous studies of the vorticity
equation may continue to reveal more fundamental
aspects of vortex dynamics and coherent structures.

The statistical theory has also been put on a firm
foundation with the notion of statistical solution of the 
Navier-Stokes equations. It addresses the existence 
and regularity of the probability distribution assumed 
for turbulent flows and of the fundamental elements of 
the statistical theory such as correlation functions and 
spectra. Based on that, a number of relations between 
physical quantities of turbulent flows may be derived 
in a mathematically sound and definitive way. This
does not replace other theories; it is mostly a
mathematical framework upon which other techniques
can be applied to yield rigorous results.

Despite the difficulties in the mathematical theory
of the Navier-Stokes equations, some successes have been
achieved, such as estimates for the number of degrees of freedom in
terms of fractal dimensions of suitable sets associated
with the solutions of the Navier-Stokes
equations, and partial estimates of a number of
relations derived in the statistical theory of fully
developed turbulence.


See also: Bifurcations in Fluid Dynamics; Geophysical
Dynamics; Incompressible Euler Equations:
Mathematical Theory; Intermittency in Turbulence;
Inviscid Flows; Lagrangian Dispersion (Passive Scalar);
Stochastic Hydrodynamics; Variational Methods in
Turbulence; Viscous Incompressible Fluids:
Mathematical Theory; Vortex Dynamics; Wavelets:
Application to Turbulence.


Twistor Theory: Some Applications


Introduction


Roger Penrose introduced twistor theory as a geometrical
framework for basic physics in order to unify
quantum theory and gravity. This program has had
many successes along the way, but the long-term goals
of reformulating and superseding the established
theories of basic physics are still a long way from
being fulfilled. Nevertheless, the successes have had
many important applications across mathematics and
mathematical physics. This article will concentrate on
three areas of application: integrable systems, geometry,
and perturbative gauge theory (via twistor-string
theory). It is intended to be self-contained as far as
possible, but the reader may well find it easier to first
read the article Twistors.


Twistor Theory


A basic motivation of twistor theory is to bring out 
the complex (holomorphic) geometry that underlies 
real spacetime. In general relativity, a spacetime is a
4-manifold with metric $g$ of signature (1,3), and
when it is flat, that is, $g = dt^2 - dx^2 - dy^2 - dz^2$,
where $(t, x, y, z)$ are coordinates on $\mathbb{R}^4$, it is called
Minkowski space. The first appearance of a complex
structure arises from the fact that, at a given
event, the celestial sphere of light rays (directions of
zero length with respect to $g$) naturally has the
structure of the Riemann sphere, $\mathbb{CP}^1$, in such a way
that Lorentz transformations (linear transformations
of the tangent space preserving the metric) act on
this sphere by Möbius transformations. These are
the maximal group of complex analytic transformations
of $\mathbb{CP}^1$.

Twistor space extends this idea to the whole of
Minkowski space. Denoted $\mathbb{PT}$, the twistor space for
Minkowski space is complex projective 3-space, $\mathbb{CP}^3$,
the space of one-dimensional subspaces of $\mathbb{C}^4$; it is a
three-dimensional complex manifold obtained by adding
a “plane at infinity” to $\mathbb{C}^3$. Explicitly, we can
introduce homogeneous coordinates $Z^\alpha \in \mathbb{C}^4 - \{0\}$
with $\alpha = 0, 1, 2, 3$ but where $Z^\alpha \sim \lambda Z^\alpha$ for $\lambda \in \mathbb{C} - \{0\}$.
Affine coordinates on a $\mathbb{C}^3$ chart $Z^3 \neq 0$ can
be obtained by setting $(z_1, z_2, \lambda) = (Z^0/Z^3, Z^1/Z^3,
Z^2/Z^3)$. Physically, points of twistor space correspond
to spinning massless particles in Minkowski
space. Mathematically, the correspondence can be
understood as the Klein correspondence.


The Klein Correspondence 


The correspondence between PT and Minkowski 
space can be extended first to complexified Minkowski 
space so that the coordinates are allowed to take on 
values in C, and then to its conformal compactification 
by including the “light cone at infinity.” It then 
coincides with the classical complex Klein correspondence.
The Klein correspondence is the one-to-one
correspondence between lines in $\mathbb{CP}^3$ and points of a
four complex-dimensional quadric, $\mathbb{CM}$, in $\mathbb{CP}^5$. The
4-quadric $\mathbb{CM}$ can be understood as conformally
compactified complexified Minkowski space. Introducing
affine coordinates $(z_1, z_2, \lambda)$ on $\mathbb{PT}$ and $(t, x, y, z)$
on $\mathbb{CM}$, we find that a point $(t, x, y, z)$ in $\mathbb{CM}$
corresponds to a line in $\mathbb{PT}$ according to

\[ \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} t - z & x + iy \\ x - iy & t + z \end{pmatrix} \begin{pmatrix} 1 \\ \lambda \end{pmatrix}. \]


Alternatively, fixing $(\lambda, z_1, z_2)$ in these equations
gives a 2-plane in complex Minkowski space
corresponding to all the lines in $\mathbb{PT}$ through
$(\lambda, z_1, z_2)$. Such 2-planes are called “$\alpha$-planes.”
They are totally null (i.e., the tangent vectors not
only have zero length but are also mutually
orthogonal) and also self-dual (under the differential
geometer's notion of Hodge duality).
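The total nullity of these 2-planes can be verified directly from the incidence relations $z_1 = (t - z) + \lambda(x + iy)$ and $z_2 = (x - iy) + \lambda(t + z)$. The sketch below is a minimal numerical check, assuming the metric $dt^2 - dx^2 - dy^2 - dz^2$ extended complex-bilinearly; the helper names and the sample value of $\lambda$ are arbitrary choices made for illustration.

```python
def to_txyz(a, b, c, d):
    """Invert a = t - z, b = x + iy, c = x - iy, d = t + z."""
    t = (a + d) / 2
    z = (d - a) / 2
    x = (b + c) / 2
    y = (b - c) / 2j
    return (t, x, y, z)


def minkowski(v, w):
    """Complex-bilinear extension of the metric dt^2 - dx^2 - dy^2 - dz^2."""
    return v[0] * w[0] - v[1] * w[1] - v[2] * w[2] - v[3] * w[3]


def alpha_plane_tangents(lam):
    """Two independent tangent vectors of the plane with fixed (lam, z1, z2).

    Holding z1 = a + lam*b and z2 = c + lam*d fixed forces
    da = -lam*db and dc = -lam*dd, with db and dd free.
    """
    v = to_txyz(-lam, 1, 0, 0)   # db = 1, dd = 0
    w = to_txyz(0, 0, -lam, 1)   # db = 0, dd = 1
    return v, w


lam = 0.3 - 1.7j  # arbitrary sample value
v, w = alpha_plane_tangents(lam)
products = [minkowski(v, v), minkowski(w, w), minkowski(v, w)]
print(products)
```

All three inner products vanish up to rounding, so every tangent vector to the plane is null and any two are orthogonal, which is what “totally null” means.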

This complex correspondence can also be
restricted to give correspondences for $\mathbb{R}^4$ with
metrics of positive-definite, Euclidean, signature or
ultrahyperbolic, (2,2), signature. A particular simplification
in Euclidean signature is that the complex
$\alpha$-planes intersect the real slice in a point. The
conformal compactification of Euclidean $\mathbb{R}^4$ is the
4-sphere $S^4$ given by adding a single point at infinity,
and so we have a projection $p : \mathbb{PT} \to S^4$ whose
fibers are holomorphically embedded $\mathbb{CP}^1$s. These
fibers can be characterized as the lines in $\mathbb{PT}$ that
are invariant under a quaternionic complex conjugation,
which is an antiholomorphic map $\hat{\ } : \mathbb{PT} \to
\mathbb{PT}$ with no fixed points. (Here quaternionic means
that on the nonprojective twistor space, $\mathbb{T} = \mathbb{C}^4$, the
conjugation has the property $\hat{\hat{Z}} = -Z$ so that it
defines a second complex structure anticommuting
with the standard one; this is sufficient to express
$\mathbb{T} = \mathbb{Q}^2$, where $\mathbb{Q}$ denotes the quaternions. The
complex structures $i$, $j$, and $k$ of the quaternions
are given by identifying $i$ with $\sqrt{-1}$ on $\mathbb{C}^4$ and $j$
with $\hat{\ }$ and $k = ij$.)


The Penrose Transform 


A basic task of twistor theory is to transform 
solutions to the field equations of mathematical 
physics into objects on twistor space. This works 
well for linear massless fields such as the Weyl 
neutrino equation, Maxwell's equations for electro- 
magnetism and linearized gravity. In its general 
form, this transform has become known as the 
Penrose transform. Such fields correspond to freely 
prescribable holomorphic functions $f(\lambda, z_1, z_2)$ (or,
more precisely, analytic cohomology classes) on 
regions of twistor space. The field can be obtained 
from this function by means of a contour integral. 
The simplest of these integral formulas is 


\[ \phi(t, x, y, z) = \oint f\bigl(\lambda,\; t - z + \lambda(x + iy),\; x - iy + \lambda(t + z)\bigr)\, d\lambda \]


and differentiation under the integral sign leads
easily to the fact that $\phi$ satisfies the wave equation

\[ \frac{\partial^2\phi}{\partial t^2} - \frac{\partial^2\phi}{\partial x^2} - \frac{\partial^2\phi}{\partial y^2} - \frac{\partial^2\phi}{\partial z^2} = 0. \]

This formula was originally discovered by Bateman.
Note that $f$ must have singularities on twistor space
to yield a nontrivial $\phi$ and even then, there are many
choices of $f$ that yield zero. For a solution $\phi$ defined
over a region $U$ in spacetime, the function $f$ is
correctly understood as a representative of a Čech
cohomology class defined on the region $U'$ in twistor
space swept out by the lines corresponding to points
of $U$. Furthermore, the function $f$ should be taken
globally to be a function of homogeneity $-2$,
$f(\lambda Z^\alpha) = \lambda^{-2} f(Z^\alpha)$. This formula has generalizations
to massless fields of all helicities in which a field of
helicity $s$ corresponds to a function (Čech cocycle) of
homogeneity degree $2s - 2$.
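The contour integral formula can be tried out on a simple example. The sketch below assumes the twistor function $f = 1/(z_1 z_2)$ (an illustrative choice with the required singularities, not one taken from the article), integrates numerically around the pole where $z_1 = 0$, and compares the result with the residue evaluation $2\pi i/(x^2 + y^2 + z^2 - t^2)$. It then applies a finite-difference wave operator to the numerically computed field. The function names, the sample point, and the step sizes are all assumptions.

```python
import cmath
import math


def f(lam, z1, z2):
    """Illustrative twistor function with simple poles (an assumed choice)."""
    return 1.0 / (z1 * z2)


def phi_contour(t, x, y, z, n=400):
    """Contour integral of f around the pole z1 = 0, by the trapezoid rule."""
    a, b = t - z, x + 1j * y
    c, d = x - 1j * y, t + z
    pole1 = -a / b                      # zero of z1 = a + lam * b
    pole2 = -c / d                      # zero of z2 = c + lam * d
    radius = 0.5 * abs(pole1 - pole2)   # circle encloses pole1 only
    total = 0.0
    for k in range(n):
        phase = cmath.exp(2j * math.pi * k / n)
        lam = pole1 + radius * phase
        dlam = 1j * radius * phase * (2.0 * math.pi / n)
        total += f(lam, a + lam * b, c + lam * d) * dlam
    return total


def phi_closed_form(t, x, y, z):
    """Residue evaluation of the same integral."""
    return 2j * math.pi / (x * x + y * y + z * z - t * t)


def wave_operator(phi, t, x, y, z, h=1e-3):
    """Central-difference approximation of phi_tt - phi_xx - phi_yy - phi_zz."""
    point = [t, x, y, z]

    def second(i):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        return (phi(*plus) - 2.0 * phi(*point) + phi(*minus)) / (h * h)

    return second(0) - second(1) - second(2) - second(3)


point = (0.3, 0.7, -0.2, 0.5)   # sample point away from the light cone of the origin
print(phi_contour(*point), phi_closed_form(*point))
print(abs(wave_operator(phi_contour, *point)))
```

The residual of the wave operator is small compared with the individual second derivatives and is limited only by the finite-difference step. The sketch requires $x + iy \neq 0$, $t + z \neq 0$, and a point off the light cone of the origin, so that the two poles exist and are distinct.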

The Penrose transform has found important 
applications in representation theory and integral 
geometry. For a review, the reader is referred to 
Baston and Eastwood (1989), the relevant survey 
articles in Bailey and Baston (1990), or Mason and 
Hughston (1990, chapter 1). 


Twistor Theory and Nonlinear Equations 


The Penrose transform for the Maxwell equations 
and linearized gravity turns out to be linearizations 
of correspondences for the nonlinear analogs of 
these equations: the Einstein vacuum equations and 
the Yang-Mills equations. However, the construc- 
tions only work when these fields are anti-self-dual. 
This is the condition that the curvature 2-forms 
satisfy $F^* = -iF$, where $*$ denotes the Hodge dual
(which, up to certain factors of +i, has the effect of 
interchanging electric and magnetic fields); it is a 
nonlinear generalization of the right-handed circular 
polarization condition. Explicitly, in terms of spacetime
indices $a, b, \ldots = 0, 1, 2, 3$, $F^*_{ab} = \tfrac{1}{2}\varepsilon_{abcd}F^{cd}$,
where $\varepsilon_{0123} = 1$ and $\varepsilon_{abcd} = \varepsilon_{[abcd]}$. In Minkowski
signature, the i factor in the anti-self-duality condi- 
tion implies that real fields cannot be anti-self-dual. 
Thus, these extensions are not sufficient to fulfill the 
ambitions of twistor theory to incorporate real 
classical nonlinear physics in Minkowski space. 
However, the factor of i is not present in Euclidean 
and ultrahyperbolic signature, so the anti-self- 
duality condition is consistent with real fields in 
these signatures and this is where the main applica- 
tions of these constructions have been. 
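The role of the factor of $i$ can be traced to the square of the Hodge dual on 2-forms: it equals $-1$ in Minkowski signature, so the eigenvalues of $*$ are $\pm i$, while it equals $+1$ in Euclidean and ultrahyperbolic signature, where the eigenvalues are $\pm 1$ and real eigenforms exist. The sketch below checks this with the component formula $F^*_{ab} = \tfrac{1}{2}\varepsilon_{abcd}F^{cd}$ for diagonal metrics with entries $\pm 1$; the sample 2-form and the function names are arbitrary choices.

```python
from itertools import permutations


def levi_civita():
    """Totally antisymmetric symbol with eps[(0, 1, 2, 3)] = 1."""
    eps = {}
    for perm in permutations(range(4)):
        inversions = sum(
            1 for i in range(4) for j in range(i + 1, 4) if perm[i] > perm[j]
        )
        eps[perm] = -1 if inversions % 2 else 1
    return eps


EPS = levi_civita()


def hodge(F, signs):
    """F*_{ab} = (1/2) eps_{abcd} F^{cd} for the diagonal metric diag(signs)."""
    dual = [[0.0] * 4 for _ in range(4)]
    for (a, b, c, d), sign in EPS.items():
        raised = signs[c] * signs[d] * F[c][d]   # indices raised with the metric
        dual[a][b] += 0.5 * sign * raised
    return dual


def sample_two_form():
    """An arbitrary antisymmetric array of components F_{ab}."""
    F = [[0] * 4 for _ in range(4)]
    values = iter([1, 2, 3, 4, 5, 6])
    for a in range(4):
        for b in range(a + 1, 4):
            F[a][b] = next(values)
            F[b][a] = -F[a][b]
    return F


def double_dual_sign(signs):
    """Return k such that (F*)* = k F for the sample 2-form, or None."""
    F = sample_two_form()
    FF = hodge(hodge(F, signs), signs)
    for k in (1, -1):
        if all(FF[a][b] == k * F[a][b] for a in range(4) for b in range(4)):
            return k
    return None


for name, signs in [
    ("Minkowski (1,3)", (1, -1, -1, -1)),
    ("Euclidean (4,0)", (1, 1, 1, 1)),
    ("ultrahyperbolic (2,2)", (1, 1, -1, -1)),
]:
    print(name, double_dual_sign(signs))
```

For these diagonal metrics the sign returned is the product of the four metric entries, which is negative only in the Minkowski case.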


The Nonlinear Graviton Construction 
and Its Generalizations 


The first nonlinear twistor construction was due to 
Penrose (1976), and was inspired by Newman's 
(1976) construction of “heavens” from the infinities
of asymptotically flat spacetimes in general 
relativity. 

The nonlinear graviton construction proceeds
from the definition of twistors in flat spacetime as
$\alpha$-planes in complexified Minkowski space. It is
natural to ask which complexified metrics admit a
full family of $\alpha$-surfaces, that is, 2-surfaces that are
totally null and self-dual. The answer is that a full
family of $\alpha$-surfaces exists iff the conformally
invariant part of the curvature tensor, the Weyl
tensor, is anti-self-dual. If this is the case, twistor
space can be defined to be the (necessarily three-dimensional)
space of such $\alpha$-surfaces.

The remarkable fact is that the twistor space, 
together with its complex structure, is sufficient to 
determine the original spacetime. Twistor space is 
again a three-dimensional complex manifold, and 
contains holomorphically embedded rational curves, 
$\mathbb{CP}^1$s, at least one for each point of the spacetime.
However, holomorphic rigidity implies that the 
family of rational curves is precisely four- 
dimensional over the complex numbers. Further- 
more, incidence of a pair of curves can be taken to 
imply that the corresponding points in spacetime lie 
on a null geodesic and this yields a conformal 
structure on spacetime. Further structures on twistor 
space can be imposed to give the complex spacetime 
a metric that is vacuum, perhaps with a cosmologi- 
cal constant. The correspondence is stable under 
small deformations and so the data defining the 
twistor space is effectively freely prescribable, see 
Penrose (1976). 

In Euclidean signature, again the complex 
$\alpha$-planes intersect the real spacetime in a point, so
the twistor space again fibers over spacetime. The 
twistor fibration can be constructed as the projecti- 
vized bundle of self-dual spinors or more commonly 
as the unit sphere bundle in the space of self-dual 
2-forms (Atiyah et al. 1978). In the latter formula- 
tion, the complex structure on the twistor space 
arises from the direct sum of the naturally defined 
complex structures on the horizontal and vertical 
tangent spaces to the bundle; that on the vertical 
subspace is the standard one on the sphere, and that 
on the horizontal subspace is a multiple of the self- 
dual 2-form at the given point of the fiber. 

There are now large families of extensions, 
generalizations, and reductions of this construction. 
They are all based on the idea of realizing a space 
with a given complexified geometric structure as the 
parameter space of a family of holomorphically 
embedded submanifolds inside a twistor space. In 
general, the most useful of these constructions are 
those in which the “spacetime” is obtained as the 
space of rational curves in a twistor space. This is 
because the equations that are solved on the 
corresponding spacetime can be thought of as a 
completely integrable system in which the integrability
condition for the generalized $\alpha$-surfaces is
interpreted as the consistency condition of a Lax
pair or more general linear system. For a more
detailed discussion from this point of view, see 
Mason and Woodhouse (1996, chapter 13). 


The Anti-Self-Dual Yang-Mills Equation 
and Its Twistor Correspondence 


The anti-self-dual Yang-Mills equations extend 
Maxwell's equations for electromagnetism in the 
right-circularly polarized case. They are a family of 
equations that depend on a choice of Lie group G, 
usually taken to be a group of complex matrices; 
Maxwell's equations arise when G = U(1). 

Introduce coordinates $x^a$, $a = 0, 1, 2, 3$, on $\mathbb{R}^4$ with
metric $ds^2 = dx^0\,dx^3 - dx^1\,dx^2$ (this is a metric of
ultrahyperbolic signature; Euclidean signature can
be obtained by choosing the coordinates to be
complex, but with $(x^3, -x^2)$ the complex conjugates
of $(x^0, x^1)$). The dependent variables are the components
$A_a$ of a connection $D_a = \partial_a - A_a$, where
$\partial_a = \partial/\partial x^a$ and $A_a = A_a(x^b) \in \mathrm{Lie}\,G$, the Lie algebra
of $G$. This connection defines a method of differentiating
vector-valued functions $s$ in some representation
of $G$. The freedom in changing bases for
the vector bundle induces the gauge transformations
$A_a \to g^{-1}A_a g - g^{-1}\partial_a g$, $g(x) \in G$, on $A_a$; two connections
that are related by a gauge transformation are
deemed to be the same. The anti-self-dual Yang-Mills
equations are the condition

\[ [D_0, D_2] = [D_1, D_3] = [D_0, D_3] + [D_1, D_2] = 0. \]

They are the compatibility conditions

\[ [D_0 + \lambda D_1,\; D_2 + \lambda D_3] = 0 \]

for the linear system of equations

\[ (D_0 + \lambda D_1)s = (D_2 + \lambda D_3)s = 0 \qquad [1] \]

where $\lambda \in \mathbb{C}$ and $s$ is an $n$-component column
vector. These latter equations form a “Lax pair”
for the system.
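The link between the compatibility condition and the field equations comes from expanding the commutator $[D_0 + \lambda D_1, D_2 + \lambda D_3]$ in powers of $\lambda$: the coefficients of $1$, $\lambda$, and $\lambda^2$ are $[D_0, D_2]$, $[D_0, D_3] + [D_1, D_2]$, and $[D_1, D_3]$, and all three must vanish if the commutator is to vanish for every $\lambda$. The sketch below checks this expansion numerically under a simplifying assumption: the potentials $A_a$ are constant matrices, so that $[D_a, D_b]$ reduces to the matrix commutator $[A_a, A_b]$. The matrices are random $2 \times 2$ samples and all helper names are hypothetical.

```python
import random


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def scale(c, X):
    return [[c * x for x in row] for row in X]


def comm(X, Y):
    """Matrix commutator [X, Y] = XY - YX."""
    return add(matmul(X, Y), scale(-1, matmul(Y, X)))


def random_matrix(rng, n=2):
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)] for _ in range(n)]


def max_abs(X):
    return max(abs(x) for row in X for x in row)


rng = random.Random(0)
A0, A1, A2, A3 = (random_matrix(rng) for _ in range(4))

# The three coefficients of the expansion in powers of lam.
c0 = comm(A0, A2)
c1 = add(comm(A0, A3), comm(A1, A2))
c2 = comm(A1, A3)

errors = []
for lam in (0.5, -2.0, 1.0 + 3.0j):
    lhs = comm(add(A0, scale(lam, A1)), add(A2, scale(lam, A3)))
    rhs = add(c0, add(scale(lam, c1), scale(lam * lam, c2)))
    errors.append(max_abs(add(lhs, scale(-1, rhs))))
print(errors)
```

The errors are at rounding level for every sample value of $\lambda$, confirming that the single $\lambda$-dependent condition packages exactly the three equations.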

The Ward (1977) construction provides a one-one
correspondence between gauge equivalence classes
of solutions of the anti-self-dual Yang-Mills equations
and holomorphic vector bundles on regions in
twistor space. The key point here is that eqn [1]
defines parallel propagation along $\alpha$-planes. To each
point $Z$ in twistor space, we can associate the vector
space $E_Z$ of solutions to eqn [1] along the
corresponding $\alpha$-plane. These vector spaces vary
holomorphically with $Z$ and that is what one means
by a holomorphic vector bundle $E \to \mathbb{PT}$. The
remarkable fact is that the anti-self-dual Yang-Mills
field can be reconstructed up to gauge from $E$,
and, in effect, for local analytic solutions, $E$ can be
represented by freely prescribable “patching” data
consisting of local holomorphic matrix-valued functions
on twistor space. To construct the solution on
spacetime, one must first find a Birkhoff factorization
of the patching data on each Riemann sphere in
twistor space corresponding to points of the appropriate
region in spacetime. On each Riemann sphere,
the Birkhoff factorization starts with the given
patching function with values in $\mathrm{GL}(n, \mathbb{C})$ on the
real axis in the complex plane, and expresses it as a
product of functions with values in $\mathrm{GL}(n, \mathbb{C})$, one of
which extends over the upper-half plane, and the
other over the lower-half complex plane. The anti-self-dual
connection can be obtained by differentiating
the resulting matrices. See Penrose (1984, 1986),
Ward and Wells (1990), or Mason and Woodhouse
(1996) for a full discussion, and Atiyah (1979) for
the formulation appropriate to Euclidean signature.


Completely Integrable Systems 


In effect, the twistor constructions amount to 
providing a geometric general local solution to the 
anti-self-duality equations; the twistor data is, for a 
local solution, freely prescribable. In this sense, they 
demonstrate complete integrability of the anti-self- 
duality equations. The reconstruction of a solution 
on spacetime from twistor data is not a quadrature — 
it involves, in the anti-self-dual Yang-Mills case, a 
Birkhoff factorization (also sometimes referred to as 
the solution to a Riemann-Hilbert problem), and in 
the case of the anti-self-dual Einstein equations, the 
construction of a family of rational curves inside a 
complex manifold. Nevertheless, such constructions 
are a familiar part of the apparatus of the theory of 
integrable systems. 

In Ward (1985), this connection with integrable 
systems was developed further, and the anti-self- 
dual Yang-Mills equations were shown to yield 
many important integrable systems under symmetry 
reduction. Ward's list has been extended and now 
includes many of the most famous examples of 
integrable systems such as the Painlevé equations,
the Korteweg-de Vries (KdV) equation, the nonlinear
Schrödinger equation, the $n$-wave equations,
and so on; see Mason and Woodhouse (1996) for a
review. There are some notable omissions from the
list such as the Kadomtsev-Petviashvili (KP) and 
Davey-Stewartson equations (at least if one restricts 
oneself to finite-dimensional gauge groups; reduc- 
tions using infinite dimensional gauge groups have 
been obtained). 

The list of integrable systems obtainable by 
symmetry reduction nevertheless remains impressive 
and provides a route to the classification of at least 
those integrable systems that can be obtained in this 
way. Such systems can be classified by the choice of
ingredients required in the symmetry reduction: the 
gauge group, the group of spacetime symmetries to 
be reduced by, the choice of Euclidean or ultra- 
hyperbolic signature, and the choice of certain 
constants of integration that arise in the reduction. 

Another implication is that if an integrable system 
can be obtained from one of the self-duality 
equations by symmetry reduction, then it inherits a 
reduced twistor correspondence because the twistor 
correspondences share the symmetry groups of the 
spacetime field equations. These twistor correspondences
can be seen to underlie much of the theory of
these equations; for example, Bäcklund transformations
of solutions correspond to elementary algebraic
operations on the twistor data, and the
Kac-Moody Lie algebras of hidden symmetries act
locally on the twistor data by matrix multiplication
of the appropriate loop algebras. Similarly, the
inverse-scattering transform for the KdV and nonlinear
Schrödinger equations can be seen to arise as
particular presentations of the twistor construction.

By and large, although twistor methods have
yielded new insight into the geometry and structure
of systems in dimensions 1 and 2, they have not
necessarily superseded pre-existing techniques for
constructing solutions and analyzing the solution
space. The systems for which twistor methods have
been particularly effective for constructing solutions
and characterizing their properties are in 2 + 1 or
higher dimension. Key examples here are of course
the anti-self-dual Yang-Mills and Einstein equations
themselves, and their single translation reductions.
In the anti-self-dual Yang-Mills case, these reductions
lead either to Ward's or Manakov and
Zakharov's chiral model in Lorentzian signature,
2 + 1, or the Bogomolny equations for monopoles,
the reduction from Euclidean signature. In both
cases, the twistor construction has played a major
role in constructing and studying the solitonic
solutions.

See Ward and Wells (1990), Mason and Wood- 
house (1996), Ward's article in Huggett et al. (1998) 
and the first few chapters of Mason et al. (1995), 
and Mason et al. (2001) for more examples of 
aspects of the theory of integrable systems arising 
from twistor correspondences. 


Applications to Geometry 


These applications are, to a large extent, higher- 
dimensional analogs of those discussed above; most 
of the problems in geometry to which twistor theory 
has been applied are those for which the underlying 
differential equations are integrable. These start
with the Euclidean signature versions of the original
Ward construction for anti-self-dual Yang-Mills 
fields and Penrose's nonlinear graviton construction 
for Ricci-flat anti-self-dual metrics but, as we will 
discuss, these constructions have a number of 
extensions and generalizations. 

The first dramatic application of these constructions
was the ADHM construction of Yang-Mills
instantons. These are absolute minima of the Yang-Mills
action, $S[A] = \int \mathrm{tr}(F \wedge F^*)$, on the 4-sphere, $S^4$,
with its round metric. A simple argument shows that
the action is bounded below by the second Chern
class of the bundle and that this bound is achieved
only for anti-self-dual fields. Thus, the problem was
to characterize all the anti-self-dual Yang-Mills
fields on $S^4$. In this Euclidean context, twistor
space, $\mathbb{CP}^3$, fibers over $S^4$ and the corresponding
Ward vector bundle is a bundle over all of $\mathbb{CP}^3$. It
turns out that all such bundles satisfying a certain 
stability condition had been constructed reasonably 
explicitly by algebraic geometers. Since the stability 
condition was implied by the context, this could be 
turned into an algebraic construction of the general 
instanton explicit enough to give some insight into 
both the local and global structure of the solution 
space. See Atiyah (1979) for a review. 

Hitchin used the Euclidean version of the non- 
linear graviton to develop the theory of gravitational 
instantons that are asymptotically locally Euclidean 
(i.e., asymptotically $\mathbb{R}^4/\Gamma$, where $\Gamma$ is a finite
subgroup of the rotation group). These were finally
constructed by Kronheimer who again used twistor
theory to identify the appropriate parameter space,
see his article in Mason et al. (2001) and Dancer's
review of hyper-Kähler manifolds in LeBrun and
Wang (1999). 

Even in four dimensions, there are a number of 
variants of the nonlinear graviton construction. The 
basic twistor correspondence produces a twistor
space that is a complex 3-manifold $\mathcal{PT}$ for
4-manifolds with conformal structures whose Weyl
tensor is anti-self-dual. There are four natural
specializations that have attracted study: (1) the
Ricci-flat case, (2) the Einstein case (with nonzero
cosmological constant), (3) the scalar-flat Kähler
case, and (4) the hypercomplex case.

The twistor space in the Ricci-flat case admits the
additional structure of a fibration over $\mathbb{CP}^1$ together
with a holomorphic Poisson structure on the fibers
with values in the pullback of the 1-forms on $\mathbb{CP}^1$
(alternatively, the bundle of holomorphic 3-forms
should be the pullback of the square of the bundle of
holomorphic 1-forms on $\mathbb{CP}^1$). The Einstein case
with nonzero cosmological constant is a variant of
this in which the twistor space admits a
nondegenerate holomorphic contact structure, that
is, a distribution of 2-plane elements, which are only
integrable when the cosmological constant vanishes.
It also admits a Kähler form when the scalar
curvature is positive (in the negative case the
corresponding Kähler form is indefinite). For the
case of Kähler metrics with vanishing scalar curvature,
the twistor space admits a holomorphic volume
form with a double pole. The Ricci-flat case is
equivalent to the case of hyper-Kähler metrics, those
that are Kähler with respect to three different
complex structures $I$, $J$, and $K$ satisfying the standard
quaternionic relations $IJ = K$, etc. A hypercomplex
structure is obtained when one only has the
three integrable complex structures satisfying the
quaternion relations. Such manifolds admit an
underlying conformal structure that is anti-self-dual,
and the corresponding twistor space admits a
fibration to $\mathbb{CP}^1$.

These constructions have all played a significant 
role in the general analysis of these geometric 
structures, and the construction of examples. A 
striking example of an application of the nonlinear 
graviton construction to general properties is due to 
Donaldson and Friedman, who show that if two
4-manifolds admit anti-self-dual conformal structures,
then their connected sum does also.

In higher dimensions, most generalizations rely on 
quaternionic geometry and its reductions. The 
Euclidean signature formulation of the nonlinear 
graviton construction has natural extensions to 
quaternionic manifolds in $4k$ dimensions. These are
manifolds with metric whose holonomies are contained
in $\mathrm{Sp}(k) \times \mathrm{Sp}(1)$. The latter $\mathrm{Sp}(1) = \mathrm{SU}(2)$
factor leads to an associated $S^2$ bundle whose total
space is the twistor space PT and it naturally has
the structure of a $(2k + 1)$-dimensional complex
manifold.

For a series of review articles, the reader is 
referred to Bailey and Baston (1990, chapters 3 
and 4) and also LeBrun and Wang (1999, chapters 
2, 5, 6, 10, and 14) which, despite being a book on 
the distinct subject of Einstein manifolds, is strongly 
influenced by twistor theory. Other applications 
along these lines are summarized in Mason et al. 
(2001, chapter 1). 

There are a number of applications that go 
beyond complete integrability. A striking application 
is the twistor framework of Merkulov for studying 
arbitrary geometric structures. This has led to a 
classification of all possible irreducible holonomies 
of torsion-free affine connections, see Merkulov's 
article in Huggett et al. (1998). Another important 
area is in the field of conformal invariants in which
the local twistor connection plays a prominent role.
This is a connection that is naturally defined on any
conformal manifold, being the spinor representation
of the Cartan conformal connection. An impressive
application here is the construction of conformally 
invariant differential operators and other conformal 
invariants. See the article by Baston and Eastwood 
in Bailey and Baston (1990). 


Beyond Classical Integrability: 
Twistor-String Theory 


Until Witten (2004), there was little indication that 
twistor theory would have much useful to say about 
Yang-Mills or gravitational fields that are not anti- 
self-dual. Furthermore, it was problematic to incor- 
porate quantum field theory into twistor ideas. 
However, twistor-string theory has transformed the 
situation and has furthermore had impressive appli- 
cations to the field of perturbative gauge theory. 

The story starts with a formulation by Nair of the
remarkable Parke-Taylor formulas for the so-called
maximal helicity violating (MHV) amplitudes in
gauge theory. These are scattering amplitudes at tree
level in which helicity conservation is maximally
violated; using crossing symmetry to take all the
particles to be outgoing, these are amplitudes in
which $n - 2$ of the particles have helicity $-1$ and two
have helicity $+1$. These amplitudes can be expressed
simply as follows. Let the $n$ particles have color $t_i$ in
the Lie algebra of the gauge group and null
momenta $p_i$ with spinor decompositions $p_i^{AA'} = \tilde\pi_i^A \pi_i^{A'}$,
$i = 1, \ldots, n$, where the $\pi_i$ are self-dual spinors and
the $\tilde\pi_i$ are anti-self-dual spinors using the index notation
of Spinors and Spin Coefficients, and Twistors. Let
$i = r$ and $i = s$ be the two gluons of helicity $+1$. Then
the coefficient of the colour term $\mathrm{tr}(t_1 t_2 \cdots t_n)$ is
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where Ti TT Tp denotes the standard skew- 
symmetric inner product on chiral spinors and 
7441 71. A striking feature is that, except for the 
delta function, it is holomorphic in the 7;s except at 
the simple poles 7;-7;,; =0. Nair interprets these 
poles as those associated to fermion correlators in a 
current algebra on a CP! parametrized by 7. Using a 
supersymmetric formulation adapted to N — 4 super 
Yang-Mills, he formulated the amplitude as arising 
from an integral over lines in supertwistor space 
Cp?^. 
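The holomorphic structure of this coefficient can be explored numerically. The following is a minimal sketch, assuming the standard Parke-Taylor form $(\pi_r \cdot \pi_s)^4 / \prod_i \pi_i \cdot \pi_{i+1}$ with the momentum delta function omitted; the function names `bracket` and `mhv_coefficient` and the sample spinors are hypothetical and chosen only for illustration.

```python
def bracket(p, q):
    """Skew-symmetric inner product of two 2-component spinors."""
    return p[0] * q[1] - p[1] * q[0]


def mhv_coefficient(spinors, r, s):
    """Return <r s>^4 / prod_i <i, i+1> with cyclic labelling (0-based).

    The momentum-conserving delta function is deliberately left out.
    """
    n = len(spinors)
    numerator = bracket(spinors[r], spinors[s]) ** 4
    denominator = 1
    for i in range(n):
        denominator *= bracket(spinors[i], spinors[(i + 1) % n])
    return numerator / denominator


# Five arbitrary (hypothetical) spinors; gluons 0 and 2 carry helicity +1.
spinors = [
    (1 + 2j, 3 - 1j),
    (2 - 1j, 1 + 1j),
    (0.5 + 0j, -2 + 3j),
    (1 - 1j, 4 + 0j),
    (-3 + 1j, 2 + 2j),
]
value = mhv_coefficient(spinors, 0, 2)

# Cyclically relabel the particles: old particle k becomes new particle k - 1.
rotated = spinors[1:] + spinors[:1]
value_rotated = mhv_coefficient(rotated, 4, 1)
```

The two values agree, reflecting the cyclic symmetry that matches the cyclic symmetry of the colour trace; the only singularities of the expression sit where two adjacent brackets vanish.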

Witten extends these ideas to give, at least
conjecturally, a complete theory. He proposes that
full perturbative $N = 4$ super Yang-Mills theory on
spacetime is equivalent to a string theory, a topological
B model, on a supersymmetric version of twistor
space, $PT_s = CP^{3|4}$. This is the space obtained by
taking $\mathbb{C}^{4|4}$ with bosonic coordinates $Z^\alpha$, $\alpha = 0, \ldots, 3$,
and fermionic coordinates $\eta^i$, $i = 1, \ldots, 4$, modulo the
equivalence relation $(Z^\alpha, \eta^i) \sim (\lambda Z^\alpha, \lambda\eta^i)$, where
$\lambda \in \mathbb{C}$, $\lambda \neq 0$.

The number 4 here plays two crucial but different 
roles. It is the maximum number of supersymmetries 
that Yang-Mills can have; it has the effect of 
incorporating both the positive and negative helicity 
parts of the gauge field in the same supermultiplet. It 
is also the only value of $N$ for which $CP^{3|N}$ is a
Calabi-Yau manifold and this is a necessary condition
for the topological twisted B model to be
anomaly-free. The Calabi-Yau condition is the 
condition that the manifold admit a global holo- 
morphic volume form which here is 


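    $$ \Omega_s = \epsilon_{\alpha\beta\gamma\delta}\, Z^\alpha\, dZ^\beta \wedge dZ^\gamma \wedge dZ^\delta \wedge d\eta^1 \wedge d\eta^2 \wedge d\eta^3 \wedge d\eta^4 $$


This is invariant under $(Z^\alpha, \eta^i) \to (\lambda Z^\alpha, \lambda\eta^i)$ because
$d(\lambda\eta^i) = \lambda^{-1} d\eta^i$, $\lambda \in \mathbb{C}$, which follows from the Berezinian
rule of integration $\int \eta\, d\eta = 1$ for anticommuting
variables.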
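The weight count behind this invariance can be made explicit. The sketch below assumes only the scaling rules just stated (the bosonic factor picks up one power of $\lambda$ per $Z^\alpha$ or $dZ^\alpha$, and each $d\eta^i$ picks up $\lambda^{-1}$); the symbol $N$ for a general number of fermionic coordinates is introduced here for illustration.

```latex
% Weight of the volume form under (Z, \eta) \to (\lambda Z, \lambda \eta).
% Terms containing d\lambda drop out of the bosonic factor because of the
% contraction with \epsilon_{\alpha\beta\gamma\delta} Z^\alpha.
\begin{align*}
\epsilon_{\alpha\beta\gamma\delta}\, Z^\alpha\, dZ^\beta \wedge dZ^\gamma \wedge dZ^\delta
  &\;\longmapsto\; \lambda^{4}\,
  \epsilon_{\alpha\beta\gamma\delta}\, Z^\alpha\, dZ^\beta \wedge dZ^\gamma \wedge dZ^\delta, \\
d\eta^1 \wedge \cdots \wedge d\eta^N
  &\;\longmapsto\; \lambda^{-N}\, d\eta^1 \wedge \cdots \wedge d\eta^N, \\
\text{total weight} &= 4 - N .
\end{align*}
% The form is well defined on the projective superspace only if 4 - N = 0,
% which singles out N = 4.
```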

Open-string topological twisted B models are 
known to correspond to holomorphic Chern-Simons 
theories on their target space. A holomorphic Chern- 
Simons theory is a theory whose basic variable is a
d-bar operator $\bar\partial_A = \bar\partial + A$ on a complex vector
bundle $E \to PT_s$, where $A$ is a Lie-algebra-valued
(0,1)-form on the target space, and whose action is

    $$ S[A] = \int \mathrm{tr}\Big( A\,\bar\partial A + \tfrac{2}{3} A^3 \Big) \wedge \Omega_s $$

The field equations are $\bar\partial_A^2 = 0$. The classical solutions
therefore consist of holomorphic vector bundles on
the target space, here $CP^{3|4}$. The twistor-space
representation of the fields is obtained by expanding
$A$ in the anticommuting variables $\eta^i$ to obtain

    $$ A = a + \eta^i b_i + \eta^i\eta^j c_{ij} + \eta^i\eta^j\eta^k d_{ijk} + \eta^1\eta^2\eta^3\eta^4 g $$

and $a$ has homogeneity zero, but because the
homogeneity of $\eta^i$ is of degree 1, $b_i$ has homogeneity
degree $-1$, and so on down to homogeneity degree
$-4$ for $g$. Via the Ward construction, the $a$
component corresponds to an anti-self-dual Yang-
Mills field on spacetime. The other components of $A$
can be seen to correspond to spacetime fields with
helicities $-1/2$ to $+1$ that are background coupled to
the anti-self-dual Yang-Mills field.
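The homogeneity bookkeeping for the components can be tabulated as follows. This is only a sketch: it assumes that the superfield has total weight zero and that each anticommuting coordinate has weight one, as stated above; the component labels $a$, $b_i$, $c_{ij}$, $d_{ijk}$, $g$ are used as illustrative names, and the multiplicities are the binomial coefficients for four anticommuting variables.

```latex
% A term with k factors of \eta has a coefficient of homogeneity -k,
% and there are \binom{4}{k} independent such coefficients.
\[
\begin{array}{l|ccccc}
\text{component} & a & b_i & c_{ij} & d_{ijk} & g \\ \hline
\text{number of } \eta\text{'s} & 0 & 1 & 2 & 3 & 4 \\
\text{homogeneity} & 0 & -1 & -2 & -3 & -4 \\
\text{independent components} & 1 & 4 & 6 & 4 & 1
\end{array}
\]
```

The pattern 1, 4, 6, 4, 1 is the same as the pattern of states in the $N = 4$ supermultiplet, which is the sense in which the field content comes out correctly.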
As it stands, although this holomorphic Chern-
Simons theory gives the correct field content of
$N = 4$ super Yang-Mills, the couplings are only
those of an anti-self-dual sector and more couplings
are needed to obtain full $N = 4$ super Yang-Mills.
The remarkable fact is that these can be naturally
introduced by coupling in certain D1 instantons.
The D1 instantons are algebraic curves $C$ in twistor
space and the coupling is via a pair of spinor fields $\alpha$
and $\beta$ on $C$ with values in $E$ and $E^*$, respectively,
with action


    $$ S[\alpha, \beta, A] = \int_C \beta\, \bar\partial_A \alpha $$


This leads to explicit expressions for Yang-Mills 
scattering amplitudes in terms of integrals of 
fermion correlators over the moduli spaces of such 
algebraic curves in supertwistor space. In principle, 
the integral is over all algebraic curves. However, 
algebraic curves have two topological invariants, 
their degree denoted d and genus g. An argument 
based on a classical scaling symmetry gives that
integration over just those curves of degree $d$
gives the subset of processes for which


    $$ d = q - 1 + l $$


where $q$ is the number of outgoing particles of
helicity $+1$ in the process and $l$ is the number of
loops. It is also the case that $g \le l$.
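A few instances of the degree formula $d = q - 1 + l$, together with the genus bound $g \le l$, may make the counting concrete. The labels "MHV" and "next-to-MHV" follow the usage earlier in the text for two and three particles of helicity $+1$; the table is an illustrative evaluation of the formula, not an additional result.

```latex
\begin{align*}
\text{MHV, tree level:}\quad & q = 2,\ l = 0 \;\Rightarrow\; d = 1,\ g = 0
   \quad (\text{lines}), \\
\text{next-to-MHV, tree level:}\quad & q = 3,\ l = 0 \;\Rightarrow\; d = 2,\ g = 0
   \quad (\text{conics}), \\
\text{MHV, one loop:}\quad & q = 2,\ l = 1 \;\Rightarrow\; d = 2,\ g \le 1 .
\end{align*}
```

The first line recovers Nair's description of the MHV amplitudes as integrals over lines in supertwistor space.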

An elegant formula for the amplitudes is that for
the on-shell generating functional for tree-level
scattering amplitudes $\mathcal{A}[A]$, where $A$ is the on-
shell twistor field, being the above-mentioned (0,1)-
form. The generating functional for processes with
$q = d + 1$ external fields of helicity $+1$ is then


    $$ \mathcal{A}^d[A] = \int_{C \in \mathcal{M}^d} \det\big(\bar\partial + A\big)\big|_C \; d\mu $$


where $d\mu$ is a natural measure on the moduli space
$\mathcal{M}^d$ of connected rational (genus 0) curves in $CP^{3|4}$
of degree $d$. This approach has been successfully
exploited to obtain implicit algebraic formulas for
all tree-level scattering amplitudes.

In an alternative version, the curves of degree d 
can be taken to be maximally disconnected, being 
the union of d lines. However, in this approach, we 
need to also incorporate Chern-Simons propagators 
which, for tree diagrams, join the lines into a tree. 
This gives a very flexible calculus for perturbative 
gauge theory in which scattering processes are 
obtained by gluing together MHV diagrams. It has 
been argued that the two formulations are equivalent.
On the one hand, the Chern-Simons propagator
has a simple pole when the lines meet and the
contour integral over the moduli space can be
performed using residues in such a way as to 
eliminate the Chern-Simons propagators leaving an 
integral over d intersecting lines. On the other hand, 
the measure on the space of connected curves has a 
simple pole where the curve acquires double points 
and again the contour integral can be performed in 
such a way as to yield the same integral over d 
intersecting lines. 

It should be mentioned that Berkovits has given an 
alternative version of twistor-string theory which is a 
heterotic open-string theory with target supertwistor 
space in which the strings are taken to have boundary 
on the real slice $RP^{3|4}$ in $CP^{3|4}$ (this is appropriate to a
spacetime with split signature) and the D1-instanton 
expansions are replaced by expansions in the funda- 
mental modes of the string (this is not a topological 
theory). This gives rise to the same formulas for 
scattering amplitudes as Witten's original model. 

There have been many applications now of these 
ideas, perhaps the most striking being the recursion 
relations of Britto, Cachazo, Feng, and Witten 
which give, at tree level, on-shell recurrence relations
for Yang-Mills scattering amplitudes that
suggest a hitherto unsuspected underlying structure
for Yang-Mills theory. 

Despite all these successes, twistor-string theory is 
not thought by string theorists to be a good vehicle for 
basic physics. The most serious problem is that the 
closed-string sector gives rise to conformal supergravity 
which is an unphysical theory. This is particularly 
pernicious from the point of view of analyzing loop 
diagrams as from the point of view of string theory, 
loop diagrams will carry supergravity modes. From this 
point of view, twistor-string theory is another duality, 
like AdS-CFT etc., that gives insight into some standard 
physics but is fundamentally limited. 

From the point of view of a twistor theorist, 
however, twistor-string theory has overcome major 
obstacles to the twistor programme. Hodges has 
used the BCFW recursion relations to provide all 
twistor diagrams for gauge theory. In Mason (2005) 
it is shown how to derive the main generating 
function formulas from Yang-Mills and conformal 
gravity spacetime action principles via twistor-space
actions for these theories. These twistor
actions can in the first instance be expressed purely 
bosonically and distinctly and the twistor-string 
generating function formulas are obtained by 
expanding and re-summing the classical limit of the 
path integral in a parameter that expands about the 
anti-self-dual sector. This allows one to decouple the 
Yang-Mills and conformal gravity modes, and 
indeed to work purely bosonically — one is not tied 
to super Yang-Mills. Although there is much work
to be done to extend these ideas to provide a
consistent approach to the main equations of basic 
physics, obstacles that seemed insurmountable a few 
years ago have been overcome. 


See also: Chern-Simons Models: Rigorous Results; 
Einstein Equations: Exact Solutions; General Relativity: 
Overview; Instantons: Topological Aspects; Integrable 
Systems and the Inverse Scattering Method; Riemann- 
Hilbert Methods in Integrable Systems; Spinors and Spin 
Coefficients; Twistors; Classical Groups and 
Homogeneous Spaces; Quantum Mechanics: 
Foundations; Several Complex Variables: Compact 
Manifolds; Several Complex Variables: Basic Geometric 
Theory.


Introduction


Twistor theory initially arose from two principal 
motivations: a desire for a conformally invariant 
calculus for spacetime geometry and fields on 
spacetime, and a desire to unify and account for 
the various occurrences of complex numbers and 
holomorphic functions in mathematical physics, 
especially in general relativity (Penrose and 
MacCallum 1973). The theory leads to a nonlocal 
relation between spacetime and twistor space, 
whereby a point in one is an extended object in 
the other. Part of the present-day motivation of the 
subject is that this nonlocal relation will be a 
fruitful way to approach the quantization of 
spacetime. A comparison is often invoked with 
Hamiltonian mechanics, which is a formal rephras- 
ing of classical mechanics that nonetheless provides 
a bridge from that theory to quantum mechanics. 
The hope is that the twistor theory has the right 
character to provide a bridge from general relativ- 
ity to quantum theory, specifically to quantum 
gravity.

The principal successes of twistor theory in 
mathematical physics can be characterized as 
the linear Penrose transform, which provides a 
solution of the zero-rest-mass free-field equations 
in Minkowski space in terms of sheaf cohomology in 
twistor space, and the nonlinear Penrose transform, 
which provides solutions of certain nonlinear field 
equations in terms of holomorphic geometry. These 
are treated below, together with other applications 
of twistor theory, following a brief introduction to 
twistor geometry. 

Very recently, there has been a resurgence of interest 
in twistor theory following Witten's introduction of 
twistor string theory (Witten 2003) as a string theory 
in twistor space. This is not treated here, but this 
article does provide the necessary background.

Ward RS and Wells RO (1990) Twistor Geometry and Field
Theory. Cambridge: Cambridge University Press. 

Witten E (2004) Perturbative gauge theory as a string theory in 
twistor space. Communications in Mathematical Physics 252: 
189 (arXiv:hep-th/0312171). 


Twistor Geometry 


General references for this section are the books by
Penrose and Rindler (1986) and Huggett and Tod
(1994). It will be convenient to use Penrose's
abstract index convention (Penrose and Rindler
1984, 1986), which is also used in Spinors and
Spin Coefficients. This can be used wherever vector
or tensor indices occur. Suppose that $V$ is a (real or
complex) finite-dimensional vector space with dual
$V'$. Elements of $V$ are written $v^a, u^b, w^c, \ldots$, where an
index $a, b, c, \ldots$ is regarded not as an integer in the
range 1 to $\dim V$ but simply as an abstract label
indicating that the object to which it is attached is a
vector. Elements of $V'$ are similarly written
$u_a, v_b, w_c, \ldots$ and elements of the tensor algebra as
$t^{a \ldots b}{}_{c \ldots d}$, according to valence, and so on. The usual
operations of tensor algebra are written in the way
that component calculations would suggest, but
without necessitating a choice of basis. The jump
to tensor fields on a manifold $M$ is immediate. A
metric is a particular field $g_{ab}$ and determines a Levi-
Civita connection $\nabla_a$ which defines maps
$\nabla_a : v^b \mapsto \nabla_a v^b$ and similarly for other valences. The virtue of
the formalism is that, while remaining invariant, it
can harness the strength and flexibility of calculations
in components.

With this understanding, twistors may first be
defined as the fundamental representation of
SU(2,2), so that they are elements $Z^\alpha$ of a four-
dimensional complex vector space T. T carries a
Hermitian form $\Sigma$ of signature $(+ + - -)$ which is
made explicit below and which provides an isomorphism
from the complex conjugate of T to its dual. This
isomorphism is used to eliminate all appearances of
complex-conjugate twistors from the formalism and is
therefore regarded as an antilinear map to the dual.

SU(2,2) is the double cover of O(2,4), the rotation
group of $E^{2,4}$, the six-dimensional space with flat
metric $\eta_{2,4}$ of signature $(+ - - - + -)$, which in turn is
the double cover of C(1,3), the conformal group of
Minkowski space M. This last group homomorphism
may be made explicit as follows (suspending the
abstract-index convention for the duration of this
aside): introduce pseudo-Cartesian coordinates
$x^a = (x^0, x^1, x^2, x^3)$ on M and
$y^\alpha = (y^0, y^1, y^2, y^3, y^4, y^5)$ on $E^{2,4}$.
The corresponding metrics are

    $$ \eta_{1,3} = \eta_{ab}\, dx^a dx^b = (dx^0)^2 - (dx^1)^2 - (dx^2)^2 - (dx^3)^2 \qquad [1] $$

    $$ \eta_{2,4} = \eta_{\alpha\beta}\, dy^\alpha dy^\beta = (dy^0)^2 - (dy^1)^2 - (dy^2)^2 - (dy^3)^2 + (dy^4)^2 - (dy^5)^2 \qquad [2] $$

We map M into $E^{2,4}$ by

    $$ \phi(x^a) = \big(x^0, x^1, x^2, x^3, (1 - \eta)/2, (1 + \eta)/2\big) \qquad [3] $$

where $\eta = \eta_{ab} x^a x^b$ with $\eta_{ab}$ as in [1], and it can be
checked that $\phi(M)$ is the intersection of the null cone
N of the origin in $E^{2,4}$ with the plane P defined by
$y^4 + y^5 = 1$. P is in fact a null hyperplane in $E^{2,4}$
and any point of N not on the null hyperplane
defined by

    $$ y^4 + y^5 = 0 \qquad [4] $$

can be mapped along the generators of N to a
unique point of P (recall that any point on a cone
lies on a line through the vertex: these lines are the
generators). Thus, the image of M under $\phi$ gives a
point on every generator of N except those satisfying
[4]. It can also be seen from [2] that the intrinsic
metric in $E^{2,4}$ on the intersection of N and P is
just $\eta_{1,3}$.
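The three claims made about the embedding (that the image lies on the null cone, that it lies in the plane $y^4 + y^5 = 1$, and that the induced metric is the Minkowski one) can be checked numerically. The following is a small sketch under the assumption that the map is $\phi(x) = (x^0, x^1, x^2, x^3, (1-\eta)/2, (1+\eta)/2)$ with $\eta = \eta_{ab}x^a x^b$; the function names and sample points are hypothetical.

```python
def eta_13(x):
    """Minkowski quadratic form, signature (+ - - -)."""
    return x[0] ** 2 - x[1] ** 2 - x[2] ** 2 - x[3] ** 2


def eta_24(y):
    """Flat quadratic form on the six-dimensional space, signature (+ - - - + -)."""
    return (y[0] ** 2 - y[1] ** 2 - y[2] ** 2 - y[3] ** 2
            + y[4] ** 2 - y[5] ** 2)


def phi(x):
    """Embed a point of Minkowski space into the null cone of the origin."""
    eta = eta_13(x)
    return (x[0], x[1], x[2], x[3], (1 - eta) / 2, (1 + eta) / 2)


# Two arbitrary sample points of Minkowski space.
x_a = (0.3, -1.2, 2.5, 0.7)
x_b = (1.1, 0.4, -0.6, 2.0)
y_a, y_b = phi(x_a), phi(x_b)

# Squared separation measured in the six-dimensional space and in Minkowski space.
separation_24 = eta_24(tuple(p - q for p, q in zip(y_a, y_b)))
separation_13 = eta_13(tuple(p - q for p, q in zip(x_a, x_b)))
```

Up to rounding, `eta_24(y_a)` vanishes, `y_a[4] + y_a[5]` equals one, and the two separations agree, because the last two components of the difference are equal and opposite and so cancel in the quadratic form.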

Now let PN be the projective null cone, or,
equivalently, the space of generators of N. This is a
compact manifold with topology $S^1 \times S^3$, as one can
see by intersecting N with the sphere

    $$ (y^0)^2 + (y^1)^2 + (y^2)^2 + (y^3)^2 + (y^4)^2 + (y^5)^2 = 2 $$

Each generator meets this sphere twice at, say, $y^\alpha$
and $-y^\alpha$, and PN is the quotient by this identification
of the two surfaces

    $$ (y^0)^2 + (y^4)^2 = 1 = (y^1)^2 + (y^2)^2 + (y^3)^2 + (y^5)^2 $$

which define the intersection. The metric $\eta_{2,4}$ defines
a degenerate metric on N, which, however, is
nondegenerate on any smooth cross section of N
which meets each generator once. Furthermore, the
map along the generators between any two such
cross sections is conformal. Thus, there is a
conformal metric on PN and it is conformal to
$\eta_{1,3}$. We call PN compactified Minkowski space $M_c$
as it is compact and has the same conformal metric
as Minkowski space. It can be thought of as M
compactified by the addition of some points, namely
the points of PN corresponding to the generators
satisfying [4]. To interpret these, we consider the
points satisfying the similar equation $y^4 - y^5 = 0$. By
inspection of $\phi$, [3], we see that these points
correspond to the light cone of the origin in M.
Thus, $M_c$ is obtained from M by adding a single
light cone, the light cone at infinity known as $\mathscr{I}$ and
read as "scri," short for "script-I."

Now the rotation group O(2,4) of $E^{2,4}$ maps N to
itself preserving the metric and consequently maps
PN to itself, preserving the conformal metric. Thus,
O(2,4) defines conformal transformations of $M_c$,
and a count of dimension shows that it is locally
isomorphic to the conformal group C(1,3). The map
is two-to-one with $\pm I$ in O(2,4) mapping to $I$ in
C(1,3). The fact that SU(2,2) is four-to-one homomorphic
to C(1,3) follows from calculations below.
It is because of this homomorphism of SU(2,2) and
C(1,3) that the geometry and analysis of twistors
(i.e., twistor theory) provides a formalism adapted
to conformally invariant or conformally covariant
notions in M or $M_c$.

A twistor may be expressed in terms of two-
component spinors of SL(2,C), the double cover of
the Lorentz group, as follows:

    $$ Z^\alpha = (\omega^A, \pi_{A'}) \qquad [5] $$

where again indices are abstract, so that

    $$ T = S \oplus S' $$

in terms of the spin space $S$ and complex-conjugate
dual spin space $S'$ of M. Now we can write
the action of infinitesimal elements of C(1,3)
explicitly as

    $$ \delta\omega^A = \phi^A{}_B\, \omega^B - i T^{AA'} \pi_{A'} + \Lambda \omega^A $$

    $$ \delta\pi_{A'} = \bar\phi_{A'}{}^{B'} \pi_{B'} + i B_{AA'} \omega^A - \Lambda \pi_{A'} \qquad [6] $$

where $T^{AA'}$ (a real vector) defines an infinitesimal
translation, $B_{AA'}$ (another real vector) defines an
infinitesimal special conformal transformation, $\Lambda$ (a
real constant) defines a dilatation and the (real)
bivector $M_{ab} = \phi_{AB}\epsilon_{A'B'} + \bar\phi_{A'B'}\epsilon_{AB}$ defines an
infinitesimal rotation. This gives a total of 15 parameters
for the transformation, which is the correct dimension
for C(1,3).
The Hermitian form $\Sigma(\,\cdot\,,\,\cdot\,)$ can be written as

    $$ \Sigma(Z, Z) = Z^\alpha \bar Z_\alpha = \omega^A \bar\pi_A + \bar\omega^{A'} \pi_{A'} \qquad [7] $$

and it can be checked that the transformations [6]
leave it invariant (and that its signature is $(+ + - -)$;
this establishes that SU(2,2) is locally isomorphic to
C(1,3)). Equation [7] will be referred to as the norm
of a twistor.
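As an illustration of the invariance claim, the translation part can be checked in one line. The sketch assumes that an infinitesimal translation acts by $\delta\omega^A = -iT^{AA'}\pi_{A'}$, $\delta\pi_{A'} = 0$ with $T^{AA'}$ real (consistent with the translated field used in the text), and that the norm is $\omega^A\bar\pi_A + \bar\omega^{A'}\pi_{A'}$; the remaining generators can be treated in the same way.

```latex
% Translation invariance of the norm (T^{AA'} real, so its conjugate is itself).
\begin{align*}
\delta\omega^A &= -\,i\,T^{AA'}\pi_{A'}, \qquad
\delta\bar\omega^{A'} = +\,i\,T^{AA'}\bar\pi_{A}, \qquad
\delta\pi_{A'} = 0, \\
\delta\Sigma &= \delta\omega^A\,\bar\pi_A + \delta\bar\omega^{A'}\,\pi_{A'}
  = -\,i\,T^{AA'}\pi_{A'}\bar\pi_A + i\,T^{AA'}\bar\pi_A\pi_{A'} = 0 .
\end{align*}
% Parameter count: 4 (translations) + 4 (special conformal) + 1 (dilatation)
% + 6 (rotations) = 15 = 4^2 - 1 = \dim SU(2,2).
```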
From [6], a twistor $Z^\alpha = (\omega^A, \pi_{A'})$ gives rise, under
translation by a variable $x^{AA'}$, to a spinor field $\Omega^A$
given by

    $$ \Omega^A = \omega^A - i x^{AA'} \pi_{A'} \qquad [8] $$

Differentiating [8] and symmetrizing, we see that $\Omega^A$
satisfies the differential equation

    $$ \nabla_{A'}{}^{(A} \Omega^{B)} = 0 \qquad [9] $$

which is known as the twistor equation. In fact, the
general solution of [9] takes the form of [8] for
constant spinors $\omega^A$ and $\pi_{A'}$. Furthermore, the
conformal group can be shown directly to act on
solutions of [9], so that twistor theory can begin
with the study of [9] and its solutions. In this
approach, a twistor is precisely a solution of [9].
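The step "differentiating and symmetrizing" can be spelled out as follows. This is a sketch in flat space, assuming the field has the form $\Omega^A = \omega^A - i x^{AA'}\pi_{A'}$ with constant $\omega^A$ and $\pi_{A'}$, and assuming the convention $\xi^B = \epsilon^{BC}\xi_C$ for raising spinor indices (a different convention changes only an overall sign).

```latex
\begin{align*}
\nabla_{BB'}\, x^{AA'} &= \delta_B{}^{A}\, \delta_{B'}{}^{A'}, \\
\nabla_{BB'}\, \Omega^{A} &= -\,i\, \delta_B{}^{A}\, \pi_{B'}, \\
\nabla_{B'}{}^{B}\, \Omega^{A} &= \epsilon^{BC}\, \nabla_{CB'}\, \Omega^{A}
   = -\,i\, \epsilon^{BA}\, \pi_{B'}, \\
\nabla_{B'}{}^{(B}\, \Omega^{A)} &= -\,i\, \epsilon^{(BA)}\, \pi_{B'} = 0 ,
\end{align*}
% since \epsilon^{BA} is skew, its symmetric part vanishes.
```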

Given a spinor field $\Omega^A$ of the form of [8], we may
seek the points of M where it vanishes. In general, there
are none, but if we consider complexified Minkowski
space CM, then $\Omega^A$ vanishes on a two-dimensional
complex plane with the property that every tangent
vector is of the form $\lambda^A \pi^{A'}$ for varying $\lambda^A$ and fixed
$\pi^{A'}$. The 2-plane is flat and totally null, in that the
(analytically extended) Minkowski metric vanishes
identically on it, and it has a self-dual (SD) tangent
bivector determined by $\pi^{A'}$. Such a 2-plane is known as
an $\alpha$-plane (reserving the term $\beta$-plane for a totally null
2-plane with anti-self-dual (ASD) tangent bivector). At
a given point $p$ in CM, there is an $\alpha$-plane for each
choice of $\pi_{A'}$ up to scale (in other words, for each
element of the projective (primed) spin space at $p$)
which is a copy of the complex projective line, $CP^1$.

The $\alpha$-plane is determined by the twistor up to
scale (in that a constant complex multiple of the
field $\Omega^A$ determines the same $\alpha$-plane). Thus, we
consider the projective twistor space PT which,
since T is $\mathbb{C}^4$, is a copy of complex projective
3-space, $CP^3$. This is now the space of $\alpha$-planes, but
is also compact. We define complexified, compactified
Minkowski space $CM_c$ as the space of all
(complex projective) lines in PT; then it is easy to
see that this includes CM as an open dense subset.
PT is the space of $\alpha$-planes in $CM_c$ and two lines
meet in PT iff the corresponding points in $CM_c$ lie
on an $\alpha$-plane, or, equivalently, iff they are null
separated. Thus, the conformal structure in $CM_c$ is
determined by incidence of lines in PT.

To find M and $M_c$ in this picture, we seek $\alpha$-planes
containing real points. If $\Omega^A$ from [8] vanishes at a
real $x^{AA'}$, then the contraction $\omega^A \bar\pi_A$ must be purely
imaginary, so that, by [7], the norm of the twistor is
zero. Conversely, one calculates that $\Omega^A$ can indeed
vanish at real points if the norm is zero, and that it
will then in fact vanish along a null geodesic with
tangent vector (proportional to) $\bar\pi^A \pi^{A'}$. Twistors with
norm zero are called null and the (five-dimensional,
real) submanifold of them in PT is PN. This is a
compactification of the space of (unscaled) null
geodesics in M by the inclusion of the 2-sphere of
null geodesics in $M_c$ which lie on the light cone at $\mathscr{I}$.
For use in the next section, we note the definition of
$PT^+$ and $PT^-$ as the projective twistors with positive
and negative norm, respectively.
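The first half of this argument, that a twistor whose field vanishes at a real point has zero norm, is easy to test numerically. The sketch below assumes the incidence relation $\omega^A = i x^{AA'}\pi_{A'}$ and the norm $\omega^A\bar\pi_A + \bar\omega^{A'}\pi_{A'}$; representing a real vector by a Hermitian $2 \times 2$ matrix, with the particular normalisation used in `hermitian_from_vector`, is a convention chosen here for illustration, and all function names are hypothetical.

```python
def hermitian_from_vector(x):
    """Hermitian 2x2 matrix standing in for x^{AA'} (normalisation is an assumption)."""
    t, x1, x2, x3 = x
    return ((t + x3, x1 + 1j * x2),
            (x1 - 1j * x2, t - x3))


def omega_on_line(x, pi):
    """omega^A = i x^{AA'} pi_{A'}: the twistor (omega, pi) is incident with x."""
    matrix = hermitian_from_vector(x)
    return tuple(1j * (matrix[a][0] * pi[0] + matrix[a][1] * pi[1])
                 for a in range(2))


def norm(omega, pi):
    """omega^A conj(pi)_A plus its complex conjugate."""
    return 2 * sum(omega[a] * pi[a].conjugate() for a in range(2)).real


# An arbitrary real point and an arbitrary primed spinor.
x_real = (0.3, -1.2, 2.5, 0.7)
pi = (1 + 2j, -0.5 + 1j)
null_norm = norm(omega_on_line(x_real, pi), pi)
```

For a real point the contraction is $i$ times the real number $\bar\pi_A x^{AA'} \pi_{A'}$, so `null_norm` vanishes up to rounding.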

To summarize, we have found M and $M_c$:
(complex projective) lines in PT define points of
$CM_c$; lines in PN define points of $M_c$, with one such,
call it $I$, picked out as the vertex of the null cone $\mathscr{I}$;
lines in PN which meet $I$ correspond to points of $\mathscr{I}$;
lines in PN which do not meet $I$ correspond to
points in M. As for $CM_c$, the conformal structure of
M and $M_c$ is determined by incidence in PN. We
may now note the nonlocal correspondence mentioned
in the introduction: points in $CM_c$ are lines in
PT and points in PT are $\alpha$-planes in $CM_c$.

It will be convenient to refer to the line in PT
associated with a point $x$ in $CM_c$ as $L_x$. With this
notation, it is possible to characterize the forward or
future tube in terms of twistor space: a point $x$ of
CM is in the forward tube iff its imaginary part is
timelike and past-pointing, and this is equivalent to
$L_x$ lying in $PT^+$.
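A short computation shows why the sign of the norm detects the forward tube. It is a sketch under stated assumptions: points of the line satisfy $\omega^A = i x^{AA'}\pi_{A'}$, the norm is $\omega^A\bar\pi_A + \bar\omega^{A'}\pi_{A'}$, and the real vectors $u$ and $v$ are names introduced here for the real part and minus the imaginary part of $x$.

```latex
% Write x^{AA'} = u^{AA'} - i v^{AA'} with u, v real, so that the imaginary
% part of x is past-pointing exactly when v is future-pointing.
\begin{align*}
\omega^A &= i\,x^{AA'}\pi_{A'} = i\,u^{AA'}\pi_{A'} + v^{AA'}\pi_{A'}, \\
\omega^A\bar\pi_A &= i\,u^{AA'}\bar\pi_A\pi_{A'} + v^{AA'}\bar\pi_A\pi_{A'}, \\
\Sigma(Z,Z) &= \omega^A\bar\pi_A + \bar\omega^{A'}\pi_{A'}
   = 2\,v^{AA'}\bar\pi_A\pi_{A'} .
\end{align*}
% p_a = \bar\pi_A \pi_{A'} is a future-pointing null vector, and v^a p_a > 0
% for every such p_a exactly when v is timelike and future-pointing.
% Every point of the line then has positive norm.
```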

The starting point for Riemannian twistor theory is
the fact that $CP^3$ is a fibration with fiber $CP^1$ over
$S^4$ where the fiber above a point $p$ can be interpreted
as the almost-complex structures at $p$ (since this is the
same as the projective primed spin space at $p$). In the
picture developed above, this means that there is an
$S^4$'s worth of lines filling out $CP^3$, no two of which
intersect (so that there are no null vectors and the
metric is definite). The complexification of $S^4$ with its
conformal structure is again $CM_c$.

If a twistor has nonzero norm, say $Z^\alpha \bar Z_\alpha = 2s \neq 0$,
then it can be interpreted as a massless particle with
spin $s$: the momentum is $p_a = \bar\pi_A \pi_{A'}$ and the
angular momentum bivector is

    $$ M^{ab} = i\,\omega^{(A}\bar\pi^{B)}\epsilon^{A'B'} - i\,\bar\omega^{(A'}\pi^{B')}\epsilon^{AB} $$

The angular momentum transforms
appropriately under translation by virtue of [6]
and the (Pauli-Lubanski) spin vector is $s p_a$, as it
should be for a massless spinning particle.


The Linear Penrose Transform: 
Zero-Rest-Mass Free Fields 


A zero-rest-mass free field of spin $s$ is a symmetric
spinor field $\phi_{AB \ldots C}$ with $2s$ indices which satisfies the
field equation

    $$ \nabla^{AA'} \phi_{AB \ldots C} = 0 \qquad [10] $$

The Weyl neutrino equation, source-free Maxwell
equation, and linearized Einstein vacuum equation
are examples of zero-rest-mass free-field equations,
with spins 1/2, 1, and 2, respectively, so that these
are equations of physical interest. Conventionally,
one takes the $s = 0$ case to be the wave equation, and
the complex-conjugate fields $\bar\phi_{A'B' \ldots C'}$ to have the
same spin but opposite helicity.

The conformal group acts on solutions of [10], so
that the equations are conformally invariant. The
equations can be solved by contour integral expressions
involving homogeneous functions of a twistor
variable. To be explicit, we define an operation $\rho_x$ of
restriction to the line $L_x$ for a function of a twistor
variable by the following:

    $$ \rho_x f(Z^\alpha) = f(i x^{AA'} \pi_{A'}, \pi_{A'}) \qquad [11] $$

Now suppose that $f(Z^\alpha)$ is holomorphic and homogeneous
of degree $-2s - 2$ in the twistor variable for
positive integer $2s$, but otherwise arbitrary, and
consider the integral

    $$ \bar\phi_{A'B' \ldots C'}(x) = \oint \pi_{A'} \pi_{B'} \cdots \pi_{C'}\, \rho_x f(Z^\alpha)\, \pi_{E'} d\pi^{E'} \qquad [12] $$

where there are $2s$ indices on $\bar\phi$ and the integration
is around a contour in the line $L_x$ in PT. The choice
of homogeneity ensures that the integral is well
defined but, to obtain a nonzero answer, $\rho_x f$ must
have some singularities as a function of $\pi_{A'}$ on $L_x$.
The answer then automatically gives a helicity-$(-s)$
solution of [10], as may be checked by differentiation
under the integral sign.

For a helicity-$s$ solution, we take an arbitrary
function $f(Z^\alpha)$, holomorphic and of homogeneity
$(2s - 2)$, and consider the integral

    $$ \phi_{AB \ldots C}(x) = \oint \rho_x \Big( \frac{\partial}{\partial\omega^A} \frac{\partial}{\partial\omega^B} \cdots \frac{\partial}{\partial\omega^C} f \Big)\, \pi_{E'} d\pi^{E'} \qquad [13] $$

where there are $2s$ indices on $\phi$ and the integration is
again around a contour in the line $L_x$. As before,
one needs singularities to make the contour integral
nonzero, but again the result satisfies [10].
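Both remarks about these integrals (that the homogeneities make them well defined, and that the field equations follow by differentiating under the integral sign) can be checked by a short count. The sketch assumes the integrands are as described in the text: $2s$ factors of $\pi_{A'}$ times the restricted function in the first case, $2s$ derivatives with respect to $\omega^A$ in the second, each multiplied by the weight-two form $\pi_{E'}d\pi^{E'}$; the symbol $g$ stands for an arbitrary holomorphic function and is introduced here for illustration.

```latex
% Homogeneity in \pi_{A'} along L_x (on L_x, \omega^A = i x^{AA'} \pi_{A'} is linear in \pi).
\begin{align*}
[12]:\quad & \underbrace{2s}_{\pi_{A'} \cdots \pi_{C'}}
           + \underbrace{(-2s-2)}_{\rho_x f}
           + \underbrace{2}_{\pi_{E'} d\pi^{E'}} = 0, \\
[13]:\quad & \underbrace{(2s-2) - 2s}_{\rho_x(\partial^{2s} f / \partial\omega^{2s})}
           + \underbrace{2}_{\pi_{E'} d\pi^{E'}} = 0 .
\end{align*}
% Field equations: the chain rule applied to the restriction gives
\begin{align*}
\nabla_{DD'}\, \rho_x g &= i\, \pi_{D'}\, \rho_x\!\Big( \frac{\partial g}{\partial\omega^{D}} \Big), \\
[12]:\quad & \nabla^{AA'} \big( \pi_{A'} \cdots \rho_x f \big) \;\propto\; \pi^{A'} \pi_{A'} = 0, \\
[13]:\quad & \nabla^{AA'} \rho_x\!\Big( \frac{\partial}{\partial\omega^{A}} \cdots f \Big)
   \;\propto\; \epsilon^{AD} \frac{\partial}{\partial\omega^{D}} \frac{\partial}{\partial\omega^{A}} \cdots f = 0 .
\end{align*}
```

In both cases the divergence vanishes because a skew contraction meets a symmetric expression.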

The correct framework in which to understand
these integrals is sheaf cohomology theory. For
[12], the functions with singularities are actually
elements of $H^1(\mathcal{U}, \mathcal{O}(-2s-2))$, the first cohomology
group of a region $\mathcal{U}$ in PT with coefficients in
the sheaf of germs of holomorphic functions of
homogeneity $-2s - 2$, while the fields are elements
of $H^0(U, \mathcal{Z}_s)$, the zeroth cohomology group of the
corresponding region $U$ of M with coefficients in
helicity-$s$ zero-rest-mass fields (thus, $\mathcal{U}$ must contain
the neighborhood of lines $L_x$ for points $x$ in $U$).
Similarly, [13] is interpreted cohomologically in
terms of potentials modulo a gauge. With appropriate
conditions on $\mathcal{U}$ and $U$ (for brevity, $U$ is said
to be elementary), these groups can be shown to be
isomorphic and this isomorphism is known as the
Penrose transform (Ward and Wells (1991)). A
particular instance of an elementary $U$ is the
forward tube, when $\mathcal{U}$ is $PT^+$. Since the definition
of positive frequency is holomorphicity on the
forward tube, this observation geometrizes the
notion of positive frequency in terms of twistor
space.

For free fields with mass, there are generalizations 
of [12] and [13] to solve the Dirac equation for 
different spins. However, the integrands now 
involve functions of more than one twistor variable, 
subject to an equation. This equation is a counter- 
part of the Klein-Gordon equation and breaks the 
conformal invariance (as it must, since mass does). It 
can be imposed by a projection which can in turn be 
written as a contour integral over arbitrary holo- 
morphic functions. It has been argued that the 
appropriate description of leptons and hadrons in 
twistor theory is with functions of two and three 
twistor variables, respectively. Such a function has 
two or three integer quantum numbers determined 
by the homogeneities in different variables, and this 
leads to a twistor particle classification scheme (see, 
e.g., Hughston and Sheppard (1980) and Sparling 
(1981)), similar in many respects to, but not
identical with, the standard classifications. 

Given that free fields, massive or massless, are 
determined from arbitrary twistor functions through 
contour integrals, one may translate the Feynman 
diagrams of a quantum field theory into contour 
integrals over twistor functions. In the massless case, 
the contours are compact, so that the integrals are 
finite without need for renormalization. The massive 
case is more complicated but essentially parallel. 
This is twistor diagram theory and there is a
substantial literature on it (see, e.g., the article by 
Hodges in the volume edited by Huggett et al. 
(1998)). There is currently no new physical theory, 
distinct from a known quantum field theory, to 
generate the relevant diagrams. 


The Nonlinear Penrose Transform: 
Curved Twistor Spaces 


The electromagnetic field, in Minkowski space say,
can be regarded as a spinor field subject to field
equations, in which case these equations can be
solved via the Penrose transform by contour
integrals. Alternatively, it can be seen as the
curvature of a connection on a U(1) bundle over
M, which is a more active role for the field in
curving a bundle. For SD or ASD electromagnetic
fields, there are analogous active twistor constructions.
From an ASD electromagnetic field, one may
define a connection on the primed spin space of CM
which is flat on $\alpha$-planes: if the tangents to the
$\alpha$-plane are of the form $\lambda^A \pi^{A'}$ for varying $\lambda^A$ and with
$\pi^{A'}$ fixed up to scale, then consider the propagation
of $\pi_{B'}$ around the $\alpha$-plane given by

    $$ \pi^{A'} (\nabla_{AA'} - i A_{AA'})\, \pi_{B'} = 0 \qquad [14] $$

where $A_{AA'}$ is a potential for the electromagnetic
field. This connection is flat provided

    $$ \pi^{A'} \pi^{B'} \nabla_{AA'} A^{A}{}_{B'} = 0 \qquad [15] $$

and if this is to hold for all $\pi^{A'}$ then $\nabla_{A(A'} A^{A}{}_{B')}$
vanishes and the electromagnetic field, defined as
usual as the exterior derivative of the potential, is
necessarily ASD. Now the space of $\alpha$-planes in CM
is projective twistor space PT, so we define a
holomorphic $\mathbb{C}^*$ bundle $\mathcal{T}$ over PT by taking the
fiber above an $\alpha$-plane to be choices of $\pi_{A'}$ scaled as
in [14]. If we restrict attention to the $\alpha$-planes
through a given point $p$ of CM, then by comparing
the scalings at $p$ we can trivialize the bundle; thus, $\mathcal{T}$
is trivial on lines in PT. There is a converse to this
construction and we have: there is a one-to-one
correspondence between holomorphic $\mathbb{C}^*$ bundles
on a region $\mathcal{U}$ in PT which are trivial on lines and
ASD electromagnetic fields on the corresponding
region $U$ of CM (for elementary $U$).
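The link between flatness on these planes and anti-self-duality can be seen from the spinor decomposition of the field. The following sketch assumes the standard decomposition of a Maxwell field into an unprimed and a primed symmetric spinor; the names $\varphi_{AB}$ and $\tilde\varphi_{A'B'}$ are introduced here for illustration, and which of the two is called self-dual is a matter of convention (in the conventions of the text the primed part is the SD part).

```latex
\begin{align*}
F_{ab} &= \nabla_a A_b - \nabla_b A_a
        = \varphi_{AB}\,\epsilon_{A'B'} + \tilde\varphi_{A'B'}\,\epsilon_{AB},
\qquad
\tilde\varphi_{A'B'} = \nabla_{A(A'} A^{A}{}_{B')}, \\
F_{ab}\,\lambda^A\pi^{A'}\,\mu^B\pi^{B'}
       &= \varphi_{AB}\lambda^A\mu^B \underbrace{\epsilon_{A'B'}\pi^{A'}\pi^{B'}}_{=\,0}
        + \big(\epsilon_{AB}\lambda^A\mu^B\big)\,\tilde\varphi_{A'B'}\pi^{A'}\pi^{B'} .
\end{align*}
% The curvature restricted to the plane with tangents \lambda^A \pi^{A'} therefore
% vanishes iff \tilde\varphi_{A'B'} \pi^{A'} \pi^{B'} = 0, and this holds for every
% \pi^{A'} iff the symmetric spinor \tilde\varphi_{A'B'} itself vanishes.
```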

This construction can be extended to solve the
ASD Yang-Mills equations with holomorphic vector
bundles replacing holomorphic line bundles: with $\mathcal{U}$
and elementary $U$ as above, there is a natural one-to-
one correspondence between ASD GL(n,C) gauge
fields on $U$ and holomorphic rank-$n$ vector bundles
$E$ over $\mathcal{U}$ which are trivial on $L_x$ for every $x$ in $U$.

ASD Yang-Mills fields cannot be real on M, but 
using Riemannian twistor theory, one can impose 
appropriate reality and globality conditions to 
ensure that these ASD Yang-Mills fields are both 
real and globally defined on $S^4$. These are then
instantons. The Atiyah-Drinfeld-Hitchin-Manin
(ADHM) construction of instantons (Atiyah et al.
1978) proceeds via construction of the correspond- 
ing holomorphic vector bundles over twistor space. 

The construction of ASD Yang-Mills fields is 
also the starting point for the twistor theory of 
integrable systems (Mason and Woodhouse 1996), 
following the observation that many of the known
completely integrable partial differential equations
(PDEs) (including the sine-Gordon, Korteweg-de
Vries (KdV) and nonlinear Schrödinger equations)
are reductions of the ASD Yang-Mills equations. 
Solutions of these other integrable systems can be 
given in terms of a geometrical construction, 
usually of some structure in holomorphic geometry. 

The other major active twistor construction, 
which historically preceded the Yang-Mills one, is 
Penrose's nonlinear graviton (Penrose 1976), which 
solves the ASD Einstein vacuum equations. For this,
one starts from a complex, four-dimensional manifold
$\mathcal{M}$ with holomorphic metric, vanishing Ricci
curvature and ASD Weyl tensor. These conditions
on the curvature are necessary and sufficient to
allow the existence of $\alpha$-surfaces, which generalize
$\alpha$-planes. They are two-dimensional totally null
(complex) surfaces with SD tangent bivector, one
for each choice of (null) SD bivector, or, equivalently,
for each choice of primed spinor, at each
point.

The space of $\alpha$-surfaces is a three-dimensional
complex manifold, the curved twistor space $\mathcal{PT}$.
This is curved inasmuch as it is not now (part of)
$CP^3$, but it still contains complex projective lines:
given a point $p$ in $\mathcal{M}$ there is an $\alpha$-surface through $p$
for every primed spinor at $p$ up to scale; these
$\alpha$-surfaces make up a projective line $L_p$ in $\mathcal{PT}$. The
conditions on the curvature are equivalent to the
statement that the Levi-Civita connection is flat on
primed spinors, so that there exist constant primed
spinors in $\mathcal{M}$, and the tangent bivector to an
$\alpha$-surface can be taken to be constant, without loss of
generality. The map associating a constant primed
spinor with each $\alpha$-surface defines a projection $\pi$
from $\mathcal{PT}$ to $CP^1$, so that $\mathcal{PT}$ is a fibration over
$CP^1$. The lines $L_p$ define a four-parameter family of
sections of this fibration.

To define the metric of M from $\mathcal{PT}$, one needs
the notion of normal bundle: the normal bundle of a
submanifold Y in a manifold X is $N = TX|_Y / TY$ in
terms of the tangent bundles TX and TY. The
normal bundle $N_p$ of a particular section $L_p$ is the
same in $\mathcal{PT}$ as it was in $\mathbb{PT}$, namely $H \oplus H$, where
H is the hyperplane-section line bundle over $\mathbb{CP}^1$
(Ward and Wells 1991). A section $s_V$ of $N_p$
corresponds to a vector V in $T_pM$ (think of it as
an infinitesimally neighboring point in M) and V is
defined to be null iff $s_V$ has a zero. Because of the
nature of $N_p$, this defines a quadratic conformal
metric, which, furthermore, agrees with the
conformal metric on M and generalizes the definition of
conformal metric for $\mathbb{CM}$, in terms of incidence in
$\mathbb{PT}$. To define the actual metric, as opposed to just
the conformal metric, one has a covariant-constant
choice of $\epsilon^{A'B'}$ in M which defines an $\epsilon$ on the base of
the fibration, and a Poisson structure $\mu$ on the fibers
of the projection. The definition of $\mu$ is more intricate,
but the two structures enable the metric of M to be
recovered from $\mathcal{PT}$. Penrose (1976) and Huggett and
Tod (1994) provide more details.

Now the metric and curvature properties of M
are coded into holomorphic properties of $\mathcal{PT}$
together with $\epsilon$ and $\mu$. These properties characterize
M: subject to topological conditions on M, there is
a one-to-one correspondence between holomorphic
solutions M of the Einstein vacuum equations with
ASD Weyl tensor and three-dimensional complex
manifolds $\mathcal{PT}$ fibered over $\mathbb{CP}^1$, with a
four-parameter family of sections, each with normal bundle
$H \oplus H$, and the forms $\epsilon$ and $\mu$ as above.

In fact, one only needs to assume the existence of 
one section with the correct normal bundle and the 
full four-parameter family will automatically exist, 
at least near to the initial one. Penrose (1976) 
showed how curved twistor spaces with the neces- 
sary structures could be obtained by deforming the 
neighborhood of a line in the “flat” twistor space 
$\mathbb{PT}$. The Kodaira-Spencer theory of complex
deformations ensures that the necessary lines continue to
exist under this deformation. 

The original nonlinear graviton construction has 
been extended in various ways including the follow- 
ing: to allow the possibility of a cosmological 
constant (Ward and Wells 1991); to produce real, 
Riemannian solutions (Hitchin 1995); to solve other 
but related field equations (e.g., those for
hypercomplex metrics, scalar-flat Kähler metrics or
Einstein-Weyl structures).

The search for a twistor construction of the 
SD Einstein equations (distinct from a construction 
in terms of dual twistors, which is, of course, 
provided by deforming dual twistor space) is an 
active area of research. This and other applications of 
twistor theory, including a quasilocal definition of 
mass in general relativity, the classification of affine 
holonomies and the construction of four-dimensional 
conformal field theories, may be found in the 
literature cited in the “Further reading” section. 


See also: Classical Groups and Homogeneous Spaces;
Clifford Algebras and Their Representations; Integrable
Systems: Overview; Quantum Field Theory: A Brief
Introduction; Quantum Mechanics: Foundations;
Relativistic Wave Equations Including Higher Spin Fields;
Riemann-Hilbert Problem; Spinors and Spin Coefficients;
Twistor Theory: Some Applications.
Introduction


For the last twenty years or so, two-dimensional 
(2D) conformal field theories have played an 
important role in different areas of modern theo- 
retical physics. One of the main applications of 
conformal field theory has been in string theory (see 
Compactification of Superstring Theory), where the 
excitations of the string are described, from the 
point of view of the world sheet, by a 2D conformal 
field theory. Conformal field theories have also been 
studied in the context of statistical physics, since the 
critical points of second-order phase transition are 
typically described by a conformal field theory. 
Finally, conformal field theories are interesting 
solvable toy models of genuinely interacting quan- 
tum field theories. 

From an abstract point of view, conformal field 
theories are (Euclidean) quantum field theories 
that are characterized by the property that their 
symmetry group contains, in addition to the 
Euclidean symmetries, local conformal transforma- 
tions, that is, transformations that preserve angles 
but not necessarily lengths. The local conformal 
symmetry is of special importance in two dimen- 
sions since the corresponding symmetry algebra is 
infinite dimensional in this case. As a consequence, 
2D conformal field theories have an infinite 
number of conserved quantities, and are essentially 
solvable by symmetry considerations alone. The 
mathematical formulation of these symmetries has 
led to the concept of a vertex operator algebra, 
which has become a new branch of mathematics in 
its own right. In particular, it has played a major 
role in the explanation of “monstrous moonshine”
for which Richard Borcherds received the Fields 
medal in 1998. 

In the following, we want to explain the main 
features of conformal field theory using an algebraic 
approach that will naturally lead to the concept of a 
vertex operator algebra. There are other approaches 
to the subject, most notably the formulation, 
pioneered by Segal, of conformal field theory as a 
functor from the category of Riemann surfaces to 
the category of vector spaces. Due to limitations of 
space, however, we will not be able to discuss any of 
these other approaches here. 


The Conformal Symmetry Group 


The conformal symmetry group of the n-dimensional
Euclidean space $\mathbb{R}^n$ consists of the (locally
defined) transformations that preserve angles but
not necessarily lengths. The transformations that
preserve angles as well as lengths are the well-known
translations and rotations. The conformal
group contains (in any dimension) in addition the
dilatations or scale transformations

$$x^\mu \mapsto \tilde{x}^\mu = \lambda x^\mu \qquad [1]$$

where $\lambda \in \mathbb{R}$ and $x^\mu \in \mathbb{R}^n$, as well as the so-called
special conformal transformations,

$$x^\mu \mapsto \tilde{x}^\mu = \frac{x^\mu + x^2 a^\mu}{1 + 2(x \cdot a) + x^2 a^2} \qquad [2]$$

where $a^\mu \in \mathbb{R}^n$ and $x^2 = x^\mu x_\mu$. (Note that this last
transformation is only defined for $x^\mu \neq -a^\mu/a^2$.)

If the dimension n of the space $\mathbb{R}^n$ is larger than 2,
one can show that the full conformal group is
generated by these transformations. For n = 2,
however, the group of (locally defined) conformal
transformations is much larger. To see this, it is
convenient to introduce complex coordinates for
$(x, y) \in \mathbb{R}^2$ by defining $z = x + iy$ and $\bar{z} = x - iy$.
Then any (locally) analytic function f(z) defines a
conformal transformation by $z \mapsto f(z)$, since analytic
maps preserve angles. (Incidentally, the same also
applies to $z \mapsto \overline{f(z)}$, but this would reverse the
orientation.) Clearly, the group of such
transformations is infinite dimensional; this is a special feature
of two dimensions.

In this complex notation, translations, rotations,
dilatations, and special conformal transformations
generate the Möbius group of automorphisms of the
Riemann sphere

$$z \mapsto f(z) = \frac{az + b}{cz + d} \qquad [3]$$

where a, b, c, d are complex constants with
$ad - bc \neq 0$; since rescaling a, b, c, d by a common
complex number does not modify [3], the Möbius
group is isomorphic to $SL(2,\mathbb{C})/\mathbb{Z}_2$. In addition to
these transformations (that are globally defined on the
Riemann sphere), we have an infinite set of
infinitesimal transformations generated by $L_n : z \mapsto z + \epsilon z^{n+1}$
for all $n \in \mathbb{Z}$. The generators $L_{\pm 1}$ and $L_0$ generate the
subgroup of Möbius transformations, and their
commutation relations are simply

$$[L_m, L_n] = (m - n) L_{m+n} \qquad [4]$$

In fact, [4] describes also the commutation relations
of all generators $L_n$ with $n \in \mathbb{Z}$: this is the Lie
algebra of (locally defined) 2D conformal
transformations; it is called the Witt algebra.
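The commutation relations [4] can be checked mechanically. The following is a minimal sketch, assuming the standard realization of the generators as vector fields $L_n = -z^{n+1}\, d/dz$ acting on Laurent monomials; this sign convention and the helper names `apply_L`, `combine` and `witt_relation_holds` are illustrative choices, not notation taken from the text.

```python
def apply_L(n, poly):
    """Apply L_n = -z^(n+1) d/dz to a Laurent polynomial {exponent: coefficient}.

    The sign convention is an assumption made for this sketch.
    """
    out = {}
    for k, coeff in poly.items():
        if k != 0:  # d/dz annihilates the constant term
            out[k + n] = out.get(k + n, 0) - k * coeff
    return out


def combine(p, q, factor):
    """Return p + factor * q, dropping coefficients that vanish."""
    out = dict(p)
    for k, coeff in q.items():
        out[k] = out.get(k, 0) + factor * coeff
    return {k: c for k, c in out.items() if c != 0}


def witt_relation_holds(m, n, k):
    """Check [L_m, L_n] z^k == (m - n) L_{m+n} z^k on the monomial z^k."""
    mono = {k: 1}
    lhs = combine(apply_L(m, apply_L(n, mono)), apply_L(n, apply_L(m, mono)), -1)
    rhs = combine({}, apply_L(m + n, mono), m - n)
    return lhs == rhs


if __name__ == "__main__":
    ok = all(
        witt_relation_holds(m, n, k)
        for m in range(-4, 5)
        for n in range(-4, 5)
        for k in range(-4, 5)
    )
    print("Witt relations hold on all tested monomials:", ok)
```

Working on monomials keeps the check exact (integer arithmetic only); it tests the relation on a finite range of modes and is not a proof.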


The General Structure of Conformal 
Field Theory 


A 2D conformal field theory is determined (like any 
other field theory) by its space of states and the 
collection of its correlation functions (vacuum 
expectation values). The space of states is a vector 
space H (which, in many interesting examples, is a 
Hilbert space), and the correlation functions are 
defined for collections of vectors in some dense 
subspace of H. These correlation functions are 
defined on a 2D (Euclidean) space. We shall mainly 
be interested in the case where the underlying 2D 
space is a closed compact surface; the other 
important case concerning surfaces with boundaries 
(whose analysis was pioneered by Cardy) will be 
reviewed elsewhere (see the article Boundary Con- 
formal Field Theory). The closed surfaces are 
classified (topologically) by their genus g, which 
counts the number of handles; the simplest such 
surface which we shall mainly consider is the sphere 
with g = 0; the surface with g = 1 is the torus, etc.

One of the special features of conformal field 
theory is the fact that the theory is naturally defined 
on a Riemann surface (or complex curve), that is, on 
a surface that possesses suitable complex coordi- 
nates. In the case of the sphere, the complex 
coordinates can be taken to be those of the complex 
plane that cover the sphere except for the point at 
infinity; complex coordinates around infinity are 
defined by means of the coordinate function 
$\gamma(z) = 1/z$ that maps a neighborhood of infinity to
a neighborhood of zero. With this choice of complex
coordinates, the sphere is usually referred to as the
Riemann sphere, and this choice of complex
coordinates is, up to Möbius transformations,
unique. The correlation functions of a conformal
field theory that is defined on the sphere are thus of
the form

$$\langle 0 | V(\psi_1; z_1, \bar{z}_1) \cdots V(\psi_n; z_n, \bar{z}_n) | 0 \rangle \qquad [5]$$

where $V(\psi; z, \bar{z})$ is the field that is associated to the
state $\psi$, and $z_i$ and $\bar{z}_i$ are complex conjugates of one
another. Here $|0\rangle$ denotes the $SL(2,\mathbb{C})/\mathbb{Z}_2$-invariant
vacuum. The usual locality assumption of a 2D 


(bosonic) Euclidean quantum field theory implies 
that these correlation functions are independent of 
the order in which the fields appear in [5]. 

It is conventional to think of $z = 0$ as describing
“past infinity,” and $z = \infty$ as “future infinity”; this
defines a time direction in the Euclidean field theory
and thus a quantization scheme (radial
quantization). Furthermore, we identify the space of states
with the space of “incoming” states; thus, the state $\psi$
is simply

$$\psi = V(\psi; 0, 0) |0\rangle \qquad [6]$$


We can think of $z_i$ and $\bar{z}_i$ in [5] as independent
variables, that is, we may relax the constraint that $\bar{z}_i$
is the complex conjugate of $z_i$. Then we have two
commuting actions of the conformal group on these
correlation functions: the infinitesimal action on
the $z_i$ variables is described (as before) by the $L_n$
generators, while the generators for the action on
the $\bar{z}_i$ variables are $\bar{L}_n$. In a conformal field theory,
the space of states $\mathcal{H}$ thus carries two commuting
actions of the Witt algebra. The generator $L_0 + \bar{L}_0$
can be identified with the time-translation operator,
and thus describes the energy operator. The space of
states of the physical theory should have a bounded
energy spectrum, and it is thus natural to assume
that the spectrum of both $L_0$ and $\bar{L}_0$ is bounded
from below; representations with this property are
usually called positive-energy representations. It is
relatively easy to see that the Witt algebra does not
have any unitary positive-energy representations
except for the trivial representation. However, as is
common in many instances in quantum theory, it
possesses many interesting projective
representations. These projective representations are
conventional representations of the central extension of the
Witt algebra

$$[L_m, L_n] = (m - n) L_{m+n} + \frac{c}{12}\, m (m^2 - 1)\, \delta_{m,-n} \qquad [7]$$

which is the famous Virasoro algebra. Here c is a
central element that commutes with all $L_m$; it is
called the central charge (or conformal anomaly).
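That the central term in [7] is consistent with the Lie algebra axioms can be verified directly. The sketch below, with hypothetical helper names (`bracket_basis`, `bracket`, `jacobi_defect`), encodes elements as dictionaries over the basis $\{L_n, c\}$ and checks the Jacobi identity on a finite range of modes using exact rational arithmetic.

```python
from fractions import Fraction


def bracket_basis(m, n):
    """[L_m, L_n] from [7]; integer keys label L_k, the key 'c' the central element."""
    out = {}
    if m != n:
        out[m + n] = Fraction(m - n)
    if m + n == 0:
        central = Fraction(m * (m * m - 1), 12)
        if central != 0:
            out["c"] = central
    return out


def bracket(x, y):
    """Bilinear extension of the bracket; c commutes with everything."""
    out = {}
    for a, coeff_a in x.items():
        for b, coeff_b in y.items():
            if a == "c" or b == "c":
                continue
            for key, value in bracket_basis(a, b).items():
                out[key] = out.get(key, 0) + coeff_a * coeff_b * value
    return {key: value for key, value in out.items() if value != 0}


def jacobi_defect(p, q, r):
    """Return [L_p,[L_q,L_r]] + [L_q,[L_r,L_p]] + [L_r,[L_p,L_q]] (empty dict means zero)."""
    Lp, Lq, Lr = {p: Fraction(1)}, {q: Fraction(1)}, {r: Fraction(1)}
    total = {}
    for term in (
        bracket(Lp, bracket(Lq, Lr)),
        bracket(Lq, bracket(Lr, Lp)),
        bracket(Lr, bracket(Lp, Lq)),
    ):
        for key, value in term.items():
            total[key] = total.get(key, 0) + value
    return {key: value for key, value in total.items() if value != 0}


if __name__ == "__main__":
    modes = range(-5, 6)
    ok = all(jacobi_defect(p, q, r) == {} for p in modes for q in modes for r in modes)
    print("Jacobi identity holds on tested modes:", ok)
```

The central term only contributes when the three mode numbers sum to zero, which is where a wrongly normalized or non-cubic coefficient would typically fail this check.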
Given the actions of the two Virasoro algebras
(that are generated by $L_n$ and $\bar{L}_n$), one can
decompose the space of states $\mathcal{H}$ into irreducible
representations as

$$\mathcal{H} = \bigoplus_{i,j} M_{ij}\, \mathcal{H}_i \otimes \bar{\mathcal{H}}_j \qquad [8]$$

where $\mathcal{H}_i$ ($\bar{\mathcal{H}}_j$) denotes the irreducible representations
of the algebra of $L_n$ ($\bar{L}_n$), and $M_{ij} \in \mathbb{N}_0$ describe the
multiplicities with which these combinations of
representations occur. (We are assuming here that
the space of states is completely reducible with
respect to the action of the two Virasoro algebras;
examples where this is not the case are the so-called
logarithmic conformal field theories.) The
positive-energy representations of the Virasoro algebra are
characterized by the value of the central charge, as
well as the lowest eigenvalue of $L_0$; the state $\psi$
whose $L_0$ eigenvalue is smallest is called the
highest-weight state, and its eigenvalue h, defined by
$L_0 \psi = h \psi$, is the conformal weight. The conformal
weight determines the conformal transformation
properties of $\psi$: under the conformal transformation
$z \mapsto f(z)$, $\bar{z} \mapsto \bar{f}(\bar{z})$, we have

$$V(\psi; z, \bar{z}) \mapsto (f'(z))^{h}\, (\bar{f}'(\bar{z}))^{\bar{h}}\, V(\psi; f(z), \bar{f}(\bar{z})) \qquad [9]$$

where $L_0 \psi = h \psi$ and $\bar{L}_0 \psi = \bar{h} \psi$. The corresponding
field $V(\psi; z, \bar{z})$ is then called a primary field; if [9]
only holds for the Möbius transformations [3], the
field is called quasiprimary.

Since $L_m$ with m > 0 lowers the conformal weight
of a state (see [7]), the highest-weight state $\psi$ is
necessarily annihilated by all $L_m$ (and $\bar{L}_m$) with m > 0.
However, in general the $L_m$ (and $\bar{L}_m$) with m < 0
do not annihilate $\psi$; they generate the descendants
of $\psi$ that lie in the same representation. Their
conformal transformation property is more compli- 
cated, but can be deduced from that of the primary 
state [9], as well as the commutation relations of 
the Virasoro algebra. 

The Möbius symmetry (whose generators annihilate
the vacuum) determines the 1-, 2- and 3-point
functions of quasiprimary fields up to numerical
constants: the 1-point function vanishes, unless
$h = \bar{h} = 0$, in which case $\langle 0 | V(\psi; z, \bar{z}) | 0 \rangle = C$,
independent of z and $\bar{z}$. The 2-point function of $\psi_1$ and
$\psi_2$ vanishes unless $h_1 = h_2$ and $\bar{h}_1 = \bar{h}_2$; if the
conformal weights agree, with $h = h_1 = h_2$ and
$\bar{h} = \bar{h}_1 = \bar{h}_2$, it takes the form

$$\langle 0 | V(\psi_1; z_1, \bar{z}_1) V(\psi_2; z_2, \bar{z}_2) | 0 \rangle = C\, (z_1 - z_2)^{-2h} (\bar{z}_1 - \bar{z}_2)^{-2\bar{h}} \qquad [10]$$

Finally, the structure of the 3-point function of three
quasiprimary fields $\psi_1$, $\psi_2$, and $\psi_3$ is

$$\langle 0 | V(\psi_1; z_1, \bar{z}_1) V(\psi_2; z_2, \bar{z}_2) V(\psi_3; z_3, \bar{z}_3) | 0 \rangle = C \prod_{i<j} (z_i - z_j)^{h_k - h_i - h_j} (\bar{z}_i - \bar{z}_j)^{\bar{h}_k - \bar{h}_i - \bar{h}_j} \qquad [11]$$

where for each pair i < j, k labels the third field, that
is, $k \neq i$ and $k \neq j$. The Möbius symmetry also
restricts the higher correlation functions of
quasiprimary fields: the 4-point function is determined up
to an (undetermined) function of the Möbius
invariant cross-ratio, and similar statements also
hold for n-point functions with $n \geq 5$. The full
Virasoro symmetry must then be used to restrict
these functions further; however, since the
generators $L_n$ with $n \leq -2$ do not annihilate the vacuum
$|0\rangle$, the Virasoro symmetry leads to Ward identities
that cannot be easily evaluated in general. (In typical 
examples, the Ward identities give rise to differential 
equations that must be obeyed by the correlation 
functions.) 


Chiral Fields and Vertex Operator 
Algebras 


The decomposition [8] usually contains a special
class of states that transform as the vacuum state
with respect to $\bar{L}_m$; these states are the so-called
chiral states. (Similarly, the states that transform as
the vacuum state with respect to $L_m$ are the
antichiral states.) Given the transformation
properties described above, it is not difficult to see that the
corresponding chiral fields $V(\psi; z, \bar{z})$ only depend on
z in any correlation function, that is, $V(\psi; z, \bar{z}) =
V(\psi, z)$. (Similarly, the antichiral fields only depend
on $\bar{z}$.) The chiral fields always contain the field
corresponding to the state $L_{-2}|0\rangle$, which describes a
specific component of the stress-energy tensor.

In conformal field theory, the product of two 
fields can be expressed again in terms of the fields of 
the theory. The conformal symmetry restricts the 
structure of this operator product expansion: 


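$$V(\psi_1; z_1, \bar{z}_1)\, V(\psi_2; z_2, \bar{z}_2) = \sum_i (z_1 - z_2)^{\Delta_i} (\bar{z}_1 - \bar{z}_2)^{\bar{\Delta}_i} \sum_{r,s \geq 0} V(\phi^i_{r,s}; z_2, \bar{z}_2)\, (z_1 - z_2)^r (\bar{z}_1 - \bar{z}_2)^s \qquad [12]$$

where $\Delta_i$ and $\bar{\Delta}_i$ are real numbers, and $r, s \in \mathbb{N}_0$.
(Here i labels the conformal representations that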
appear in the operator-product expansion, while r 
and s label the different descendants.) The actual 
form of this expansion (in particular, representations 
that appear) can be read off from the correlation 
functions of the theory since the identity [12] has to 
hold in all correlation functions. 

Given that the chiral fields only depend on z in all 
correlation functions, it is then clear that the 
operator-product expansion of two chiral fields 
again only contains chiral fields. Thus, the subspace 
of chiral fields closes under the operator-product 
expansion, and therefore defines a consistent (sub)- 
theory by itself. This subtheory is sometimes referred 
to as a meromorphic conformal field theory (Goddard 
1989). (Obviously, the same also applies to the 
subtheory of antichiral fields.) The operator-product 
expansion defines a product on the space of
meromorphic fields. This product involves the complex
parameters z; in a nontrivial way, and therefore does 
not directly define an algebra structure; it is, however, 
very similar to an algebra, and is therefore usually 
called a vertex operator algebra in the mathematical 
literature. The formal definition involves formal 
power series calculus and is quite complicated; details 
can be found in Frenkel, Lepowsky, and Meurman (1988).

By virtue of its definition as an identity that holds 
in arbitrary correlation functions, the operator- 
product expansion is associative, that is, 


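$$\big( V(\psi_1; z_1, \bar{z}_1)\, V(\psi_2; z_2, \bar{z}_2) \big)\, V(\psi_3; z_3, \bar{z}_3) = V(\psi_1; z_1, \bar{z}_1)\, \big( V(\psi_2; z_2, \bar{z}_2)\, V(\psi_3; z_3, \bar{z}_3) \big) \qquad [13]$$

where the brackets indicate which operator-product
expansion is evaluated first. If we consider the case
where both $\psi_1$ and $\psi_2$ are meromorphic fields, then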
the associativity of the operator-product expansion 
implies that the states in H form a representation of 
the vertex operator algebra. The same also holds for 
the vertex operator algebra associated to the anti- 
chiral fields. Thus the meromorphic fields encode in 
a sense the symmetries of the underlying theory: this 
symmetry always contains the conformal symmetry 
(since $L_{-2}|0\rangle$ is always a chiral field, and $\bar{L}_{-2}|0\rangle$
always an antichiral field). In general, however, the
symmetry may be larger. In order to take full 
advantage of this symmetry, it is then useful to 
decompose the full space of states H not just with 
respect to the two Virasoro algebras, but rather with 
respect to the two vertex operator algebras; the 
structure is again the same as in [8], where, 
however, each $\mathcal{H}_i$ and $\bar{\mathcal{H}}_j$ is now an irreducible
representation of the chiral and antichiral vertex 
operator algebra, respectively. 


Rational Theories and Zhu's Algebra 


Of particular interest are the rational conformal 
field theories that are characterized by the property 
that the corresponding vertex operator algebras only 
possess finitely many irreducible representations. 
(The name "rational" stems from the fact that the 
conformal weights and the central charge of these 
theories are rational numbers.) The simplest exam- 
ple of such rational theories are the so-called 
minimal models, for which the vertex operator 
algebra describes just the conformal symmetry: 
these models exist for a certain discrete set of 
central charges c < 1 and were first studied by
Belavin, Polyakov, and Zamolodchikov in 1984. 
(Their paper is contained in the reprint volume of 
Goddard and Olive (1988).) It was this seminal 


paper that started many of the modern develop- 
ments in conformal field theory. Another important 
class of examples are the Wess-Zumino-Witten
(WZW) models that describe the world-sheet theory 
of strings moving on a compact Lie group. The 
relevant vertex operator algebra is then generated by 
the loop group symmetries. There is some evidence 
that all rational conformal field theories can be 
obtained from the WZW models by means of two 
standard constructions, namely by considering 
cosets and taking orbifolds; thus rational conformal 
field theory seems to have something of the flavor of 
(reductive) Lie theory. 

Rational theories may be characterized in terms of 
Zhu's algebra that can be defined as follows. The 
chiral fields $V(\psi, z)$ that only depend on z must by
themselves define local operators; they can therefore
be expanded in a Laurent expansion as

$$V(\psi, z) = \sum_{n \in \mathbb{Z}} V_n(\psi)\, z^{-n-h} \qquad [14]$$

where h is the conformal weight of the state $\psi$. For
example, for the case of the holomorphic component
of the stress-energy tensor one finds

$$T(z) = \sum_{n \in \mathbb{Z}} L_n\, z^{-n-2} \qquad [15]$$

where the $L_n$ are the Virasoro generators. By the
state/field correspondence [6], it then follows that

$$V_n(\psi) |0\rangle = 0 \quad \text{for } n > -h \qquad [16]$$

and that

$$V_{-h}(\psi) |0\rangle = \psi \qquad [17]$$

(For the example of the above component of the
stress-energy tensor, [16] implies that $L_{-1}|0\rangle =
L_0|0\rangle = L_n|0\rangle = 0$ for n > 0; thus the vacuum is in
particular $SL(2,\mathbb{C})/\mathbb{Z}_2$ invariant. Furthermore, [17]
shows that $L_{-2}|0\rangle$ is the state corresponding to this
component of the stress-energy tensor.) We denote
by $\mathcal{H}_0$ the space of states that can be generated by
the action of the modes $V_n(\psi)$ from the vacuum $|0\rangle$.
On $\mathcal{H}_0$ we consider the subspace $O(\mathcal{H}_0)$ that is
spanned by the states of the form

$$V^{(N)}(\psi)\, \chi, \qquad N > 0 \qquad [18]$$

where $V^{(N)}(\psi)$ is defined by

$$V^{(N)}(\psi) = \sum_{n=0}^{h} \binom{h}{n}\, V_{-n-N}(\psi) \qquad [19]$$

and h is the conformal weight of $\psi$. Zhu's algebra is
then the quotient space

$$A = \mathcal{H}_0 / O(\mathcal{H}_0) \qquad [20]$$

It actually forms an associative algebra, where the
algebra structure is defined by

$$\psi \star \chi = V^{(0)}(\psi)\, \chi \qquad [21]$$


This algebra structure can be identified with the 
action of the “zero-mode algebra” on an arbitrary 
highest-weight state. 

Zhu’s algebra captures much of the structure of 
the (chiral) conformal field theory: in particular, it 
was shown by Zhu in 1996 that the irreducible 
representations of A are in one-to-one correspon- 
dence with the representations of the full vertex 
operator algebra. A conformal field theory is thus 
rational (in the above, physicists’, sense) if Zhu’s 
algebra is finite dimensional. (In the mathematics 
literature, a vertex operator algebra is usually called 
rational if in addition every positive-energy repre- 
sentation is completely reducible. It has been 
conjectured that this is equivalent to the condition 
that Zhu’s algebra is semisimple.) 

In practice, the determination of Zhu’s algebra is 
quite complicated, and it is therefore useful to 
obtain more easily testable conditions for
rationality. One of these is the so-called $C_2$ condition of
Zhu: a vertex operator algebra is $C_2$-cofinite if the
quotient space $\mathcal{H}_0 / O_2(\mathcal{H}_0)$ is finite dimensional,
where $O_2(\mathcal{H}_0)$ is spanned by the vectors of the form

$$V_{-n-h}(\psi)\, \chi, \qquad n \geq 1 \qquad [22]$$

It is easy to show that the $C_2$-cofiniteness condition
implies that Zhu's algebra is finite dimensional.
Gaberdiel and Neitzke have shown that every
$C_2$-cofinite vertex operator algebra has a simple
spanning set; this observation can, for example, be 
used to prove that all the fusion rules (see below) of 
such a theory are finite. 


Fusion Rules and Verlinde's Formula


As explained above, the correlation function of three 
primary fields is determined up to an overall 
constant. One important question is whether or not 
this constant actually vanishes since this determines 
the possible “couplings” of the theory. This
information is encoded in the so-called fusion rules of the
theory. More precisely, the fusion rules $N_{ij}^k \in \mathbb{N}_0$
determine the multiplicity with which the
representation of the vertex operator algebra labeled by k
appears in the operator-product expansion of the 
two representations labeled by i and j. 

In 1988, Verlinde found a remarkable relation 
between the fusion rules of a vertex operator 
algebra and the modular transformation properties 
of its characters. To each irreducible representation 


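$\mathcal{H}_i$ of a vertex operator algebra, one can define the
character

$$\chi_i(\tau) = \mathrm{tr}_{\mathcal{H}_i}\left( q^{L_0 - c/24} \right), \qquad q = e^{2\pi i \tau} \qquad [23]$$

For rational vertex operator algebras (in the
mathematical sense) these characters transform under the
modular transformation $\tau \mapsto -1/\tau$ as

$$\chi_i(-1/\tau) = \sum_j S_{ij}\, \chi_j(\tau) \qquad [24]$$

where the $S_{ij}$ are the entries of a constant matrix.
Verlinde's formula then states that, at least for unitary theories,

$$N_{ij}^k = \sum_l \frac{S_{il}\, S_{jl}\, S^*_{lk}}{S_{0l}} \qquad [25]$$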


where the “0” label denotes the vacuum representa- 
tion. A general argument for this formula has been 
given by Moore and Seiberg in 1989; very recently, 
this has been made more precise by Huang. 
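As an illustration of [25], the fusion rules of the simplest nontrivial minimal model can be recovered numerically. The sketch below assumes the standard modular S-matrix of the critical Ising model (central charge 1/2) in the ordering (vacuum, energy field, spin field); this matrix and the names `LABELS`, `S` and `fusion_coefficient` are supplied here for illustration and do not appear in the text.

```python
import math

# Representations of the c = 1/2 minimal model, in the order used for S below.
LABELS = ["1", "epsilon", "sigma"]

R2 = math.sqrt(2.0)

# Standard modular S-matrix of the Ising model (assumed input, real and symmetric).
S = [
    [0.5, 0.5, R2 / 2],
    [0.5, 0.5, -R2 / 2],
    [R2 / 2, -R2 / 2, 0.0],
]


def fusion_coefficient(i, j, k):
    """N_ij^k = sum_l S_il S_jl conj(S_lk) / S_0l, following [25]; S is real here."""
    total = sum(S[i][l] * S[j][l] * S[l][k] / S[0][l] for l in range(len(S)))
    return round(total)


if __name__ == "__main__":
    for i, a in enumerate(LABELS):
        for j, b in enumerate(LABELS):
            terms = [LABELS[k] for k in range(3) if fusion_coefficient(i, j, k)]
            print(f"{a} x {b} = {' + '.join(terms)}")
```

With this input the formula reproduces the familiar rules, for instance that the product of two spin fields contains the vacuum and the energy field but not the spin field.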


Modular Invariance and the Conformal 
Bootstrap 


Up to now, we have only considered conformal field 
theories on the sphere. In order for the theory to be 
well defined also on higher-genus surfaces, it is 
believed that the only additional requirement comes 
from the consistency of the torus amplitudes. In 
particular, the vacuum torus amplitude must only 
depend on the equivalence class of tori that is 
described by the modular parameter $\tau \in \mathbb{H}$, up to
the discrete identifications that are generated by the 
usual action of the modular group SL(2, Z) on the 
upper half-plane H. For the theory with decomposi- 
tion [8] this requires that the function 


$$Z(\tau, \bar{\tau}) = \sum_{ij} M_{ij}\, \chi_i(\tau)\, \bar{\chi}_j(\bar{\tau}) \qquad [26]$$


is invariant under the action of SL(2, Z). This is a 
very powerful constraint on the multiplicity matrices 
$M_{ij}$ that has been analyzed for various vertex
operator algebras. For example, Cappelli, Itzykson, 
and Zuber have shown that the modular invariant 
WZW models corresponding to the group SU(2) 
have an A-D-E classification. The case of SU(3) was 
solved by Gannon, using the Galois symmetries of 
these rational conformal field theories. 

The condition of modular invariance is relatively 
easily testable, but it does not, by itself, guarantee that 
a given space of states H comes from a consistent 
conformal field theory. In order to construct a 
consistent conformal field theory, one needs to solve 
the conformal bootstrap, that is, one has to determine 
all the normalization constants of the correlators so
that the resulting set of correlators is local and 
factorizes appropriately into 3-point correlators 
(crossing symmetry). This is typically a difficult 
problem which has only been solved explicitly for 
rather few theories, for example, the minimal models. 
Recently, it has been noticed that the conformal 
bootstrap can be more easily solved for the corre- 
sponding boundary conformal field theory. Further- 
more, Fuchs, Runkel, and Schweigert have shown that 
any solution of the boundary problem induces an 
associated solution for conformal field theory on 
surfaces without boundary. This construction relies 
heavily on the relation between 2D conformal field 
theory and 3D topological field theory (Turaev 1994). 


See also: Boundary Conformal Field Theory; 
Compactification of Superstring Theory; Current Algebra; 
Knot Theory and Physics; String Field Theory; 
Superstring Theories; Symmetries in Quantum Field 
Theory of Lower Spacetime Dimensions. 
Introduction


The Ising model is a model of a classical
ferromagnet on a lattice first introduced in 1925 in the
one-dimensional case by E Ising. At each lattice site
there is a “spin” variable $\sigma_\alpha$, which takes on the
values +1 (spin up) and −1 (spin down). The mutual
interaction energy of the pair of spins $\sigma_\alpha$ and $\sigma_{\alpha'}$,
where $\alpha$ and $\alpha'$ are nearest neighbors, is $-E(\alpha, \alpha')$ if
$\sigma_\alpha = \sigma_{\alpha'}$ and is $E(\alpha, \alpha')$ if $\sigma_\alpha = -\sigma_{\alpha'}$. In addition, the
spins can interact with an external magnetic field as
$-H\sigma_\alpha$. On a square lattice, where j specifies the row
and k specifies the column, the interaction energy
for the homogeneous case where $E_v(\alpha, \alpha')$ and
$E_h(\alpha, \alpha')$ are independent of the position $\alpha$, $\alpha'$ may
be explicitly written as

$$E(H) = -\sum_{j,k} \left[ E_h\, \sigma_{j,k} \sigma_{j,k+1} + E_v\, \sigma_{j,k} \sigma_{j+1,k} + H \sigma_{j,k} \right] \qquad [1]$$
j,k 
This very simple model [1] has the remarkable
property that in two dimensions at H = 0 many
properties of physical interest can be computed
exactly. Furthermore, the model has a ferromagnetic
phase transition at a critical temperature $T_c$, at
which the specific heat and the magnetic
susceptibility diverge and below which
there is a nonzero spontaneous magnetization. In
addition, the microscopic correlations between spins 
can also be exactly computed. These exact calcula- 
tions are the basis of the modern theory of second- 
order phase transitions used to analyze real ferro- 
magnets and real fluids near their critical points in 
both two and three dimensions. The model may also 
be interpreted as a lattice gauge theory. 


Solvability 


The solvability of the Ising model at H=0 was 
discovered by Onsager in 1944 in one of the most 
profound and inventive papers ever written in 
mathematical physics. Onsager discovered that the 
model possesses an infinite-dimensional symmetry, 
which allowed him to exactly compute the free 


energy per site. This symmetry is generated by the 
relations 


$$[A_l, A_m] = 4 G_{l-m}$$
$$[G_l, A_m] = 2 A_{m+l} - 2 A_{m-l} \qquad [2]$$
$$[G_l, G_m] = 0$$

This algebra of Onsager is a subalgebra of what is
now called the loop algebra of the Lie algebra $\mathfrak{sl}_2$;
and it is the first infinite-dimensional algebra to be 
used in physics. 

In the 60 years since Onsager first computed the 
free energy, several other methods of exact solution 
have been found. In 1949, Kaufman reduced the 
computation of the free energy to a problem of free 
fermions. A closely related combinatorial method 
was invented by Kac and Ward, Hurst and Green, 
and by Kasteleyn. Baxter (1982) has computed the
free energy by means of star-triangle equations and
functional equations.

The fermionic and the combinatorial methods are 
powerful enough to compute the correlation func- 
tions but are not generalizable to other models. The 
functional equation methods of Baxter generalize to 
many other important models but they do not give 
correlation functions. There are still aspects of 
Onsager's method that remain unexplored. 

The free energy per site in the thermodynamic 
limit is defined as 


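$$F = -k_B T \lim_{N \to \infty} N^{-1} \ln Z_I(H) \qquad [3]$$

where N is the total number of sites of the lattice
and the partition function $Z_I(H)$ is defined as

$$Z_I(H) = \sum_{\text{all } \sigma = \pm 1} e^{-E(H)/k_B T} \qquad [4]$$

with the sum being over all values $\sigma_{j,k} = \pm 1$ and $k_B$
is Boltzmann’s constant. The result of Onsager is
that, at H = 0,

$$-\frac{F}{k_B T} = \ln 2 + \frac{1}{8\pi^2} \int_0^{2\pi} d\theta_1 \int_0^{2\pi} d\theta_2\, \ln\left[ \cosh\frac{2E_h}{k_B T} \cosh\frac{2E_v}{k_B T} - \sinh\frac{2E_h}{k_B T} \cos\theta_1 - \sinh\frac{2E_v}{k_B T} \cos\theta_2 \right] \qquad [5]$$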


This free energy has a singularity at a temperature
$T_c$ defined from

$$\sinh(2E_v / k_B T_c)\, \sinh(2E_h / k_B T_c) = 1 \qquad [6]$$


and near $T_c$ the specific heat diverges as

$$c \sim -\frac{2}{k_B T_c^2\, \pi} \left( E_h^2 \sinh^2\frac{2E_v}{k_B T_c} + 2 E_v E_h + E_v^2 \sinh^2\frac{2E_h}{k_B T_c} \right) \ln|1 - T/T_c| \qquad [7]$$
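The critical condition [6] is a transcendental equation for $T_c$ and is easy to solve numerically. The following sketch uses plain bisection; the function name `critical_temperature`, the bracketing strategy and the default $k_B = 1$ are choices made for this illustration. In the isotropic case $E_v = E_h = E$ the condition reduces to $\sinh(2E/k_B T_c) = 1$, that is, $k_B T_c / E = 2/\ln(1 + \sqrt{2})$, which serves as a check.

```python
import math


def critical_temperature(E_v, E_h, k_B=1.0):
    """Solve sinh(2 E_v / k_B T) * sinh(2 E_h / k_B T) = 1 for T by bisection.

    Assumes ferromagnetic couplings E_v > 0 and E_h > 0 of comparable size;
    extremely anisotropic couplings could overflow math.sinh while bracketing.
    """
    if E_v <= 0 or E_h <= 0:
        raise ValueError("this sketch assumes E_v > 0 and E_h > 0")

    def g(T):
        # g is positive below T_c and negative above it (monotonically decreasing in T).
        return math.sinh(2 * E_v / (k_B * T)) * math.sinh(2 * E_h / (k_B * T)) - 1.0

    hi = max(E_v, E_h) / k_B
    while g(hi) > 0:  # push the upper bound above T_c
        hi *= 2.0
    lo = hi / 2.0
    while g(lo) < 0:  # push the lower bound below T_c
        lo /= 2.0

    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    Tc = critical_temperature(1.0, 1.0)
    print("isotropic k_B T_c / E :", Tc)
    print("closed form           :", 2.0 / math.log(1.0 + math.sqrt(2.0)))
```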
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The next property to be computed was the 


spontaneous magnetization, which is usually 
defined as 
AL. = im. M(H) [8] 


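However, because the solution is only available at 
$H = 0$, this definition cannot be used and instead 
$M_-$ is computed from an alternative definition in 
terms of the spin-spin correlation function 

$$\langle\sigma_{0,0}\sigma_{M,N}\rangle = \frac{1}{Z_I(0)}\sum_{\text{all } \sigma = \pm 1}\sigma_{0,0}\sigma_{M,N}\, e^{-\mathcal{E}(0)/k_BT} \qquad [9]$$

as 

$$M_-^2 = \lim_{M^2+N^2\to\infty}\langle\sigma_{0,0}\sigma_{M,N}\rangle \qquad [10]$$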


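The result for $M_-$, first announced by Onsager in 
1949, is 

$$M_- = \begin{cases}(1-k^2)^{1/8} & \text{for } T \le T_c \\ 0 & \text{for } T > T_c\end{cases} \qquad [11]$$

where 

$$k = \left(\sinh(2E_v/k_BT)\,\sinh(2E_h/k_BT)\right)^{-1} \qquad [12]$$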
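The closed form [11] with the modulus [12] can be evaluated directly. The sketch below is illustrative only; the function name and the dimensionless couplings `K_h`, `K_v` (energies divided by $k_BT$) are assumptions introduced for the example.

```python
import math


def spontaneous_magnetization(K_h, K_v):
    """Return M_- from [11], with k from [12] and K = E/(k_B T)."""
    product = math.sinh(2 * K_h) * math.sinh(2 * K_v)
    if product <= 1.0:
        # k >= 1 corresponds to T >= T_c, where the magnetization vanishes.
        return 0.0
    k = 1.0 / product
    return (1.0 - k * k) ** 0.125


# Isotropic lattice: the magnetization switches on once sinh^2(2K) exceeds 1.
for K in (0.40, 0.44, 0.45, 0.50, 1.00):
    print(K, spontaneous_magnetization(K, K))
```

The exponent 1/8 makes the curve rise very steeply just below $T_c$, which is why even couplings only slightly above the critical one already give a magnetization close to unity.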


A key point in the computation of the magnetization 
[11] from [9] is that the spin-spin correlation 
function can be written as a determinant. In fact, 
there are many such different, but equal, determinantal 
representations and the size of the smallest 
one in general is $2(|M| + |N|)$. The simplest case is 
the diagonal correlation 

$$\langle\sigma_{0,0}\sigma_{N,N}\rangle = \begin{vmatrix} a_0 & a_{-1} & a_{-2} & \cdots & a_{1-N} \\ a_1 & a_0 & a_{-1} & \cdots & a_{2-N} \\ a_2 & a_1 & a_0 & \cdots & a_{3-N} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{N-1} & a_{N-2} & a_{N-3} & \cdots & a_0 \end{vmatrix} \qquad [13]$$


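where 

$$a_n = \frac{1}{2\pi}\int_0^{2\pi} d\theta\, e^{-in\theta}\left[\frac{1 - k e^{-i\theta}}{1 - k e^{i\theta}}\right]^{1/2} \qquad [14]$$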


Determinants of the form [13], where the elements 
on each diagonal are equal, are called Toeplitz determinants. 
The study of the spin-spin correlations of the 
Ising model provides a microscopic picture of the 
behavior of the ferromagnet near the phase transition 
temperature $T_c$, and an entire branch of mathematics 
has developed from the study of the behavior of 
Toeplitz determinants when the size is large. The first 
such mathematical advance was the discovery by 
Szegő of a general formula for the limit as $N \to \infty$, 
from which the magnetization [11] is computed. 

The simplest result for the approach to this $N \to \infty$ 
limit is the behavior of the diagonal correlation at 
$T = T_c$ ($k = 1$), where [13] exactly reduces to 


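$$\langle\sigma_{0,0}\sigma_{N,N}\rangle = \left(\frac{2}{\pi}\right)^N \prod_{l=1}^{N-1}\left[1 - \frac{1}{4l^2}\right]^{l-N} \qquad [15]$$

which behaves as $N \to \infty$ as 

$$\langle\sigma_{0,0}\sigma_{N,N}\rangle \sim A N^{-1/4} \qquad [16]$$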


where $A \approx 0.6450\cdots$ is a transcendental constant. 
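The reduction of the Toeplitz determinant [13] to the product [15] at $T = T_c$, and the approach to the power law [16], can be checked numerically. The sketch below rests on one stated assumption that is not written out in the text: that at $k = 1$ the matrix elements take the closed form $a_n = 1/[\pi(n + 1/2)]$. All function names are illustrative.

```python
import math


def a_crit(n):
    """Assumed T = T_c Toeplitz element a_n = 1 / (pi (n + 1/2))."""
    return 1.0 / (math.pi * (n + 0.5))


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    size = len(m)
    result = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return result


def diagonal_correlation_det(N):
    """N x N Toeplitz determinant [13] with entries a_{i-j}."""
    return det([[a_crit(i - j) for j in range(N)] for i in range(N)])


def diagonal_correlation_product(N):
    """Closed product form [15]."""
    value = (2.0 / math.pi) ** N
    for l in range(1, N):
        value *= (1.0 - 1.0 / (4.0 * l * l)) ** (l - N)
    return value


for N in (1, 2, 3, 6):
    print(N, diagonal_correlation_det(N), diagonal_correlation_product(N))
print(diagonal_correlation_product(50) * 50 ** 0.25)  # estimate of A in [16]
```

For $N = 1$ both expressions reduce to $2/\pi$, and the rescaled product $N^{1/4}\langle\sigma_{0,0}\sigma_{N,N}\rangle$ settles quickly toward the constant $A$ quoted above.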
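Further results for large $N$ and $T$ fixed are, for 
$T < T_c$ ($k < 1$), 

$$\langle\sigma_{0,0}\sigma_{N,N}\rangle \sim (1-k^2)^{1/4}\left\{1 + \frac{k^{2(N+1)}}{2\pi N^2 (1-k^2)^2} + \cdots\right\} \qquad [17]$$

and for $T > T_c$ ($k > 1$), 

$$\langle\sigma_{0,0}\sigma_{N,N}\rangle \sim \frac{k^{-N}}{(\pi N)^{1/2}(1-k^{-2})^{1/4}} + \cdots \qquad [18]$$
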

By comparing [16] with [17] and [18], we see that 
at $T = T_c$ the correlations decay algebraically but for 
$T \ne T_c$ the decay is exponential. It is useful to write 
the exponential in [17] for $T < T_c$ as 

$$k^{2N} = e^{-2N/\xi_-} \quad\text{with}\quad \xi_-^{-1} = -\ln k \qquad [19]$$

and in [18] for $T > T_c$ as 

$$k^{-N} = e^{-N/\xi_+} \quad\text{with}\quad \xi_+^{-1} = \ln k \qquad [20]$$

The quantity $\xi_\pm$ is called the correlation length and as 
$T \to T_c$ the correlation length diverges as 

$$\xi_\pm \sim |1-k|^{-1} \sim \text{const}\cdot|T - T_c|^{-1} \qquad [21]$$


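A more profound property of the correlations is 
that they satisfy differential and difference equations. 
It was found by Jimbo and Miwa (1980) that 
the diagonal correlation function satisfies the nonlinear 
differential equation related to the sixth 
Painlevé function 

$$\left(t(t-1)\frac{d^2\sigma}{dt^2}\right)^2 = N^2\left((t-1)\frac{d\sigma}{dt} - \sigma\right)^2 - 4\frac{d\sigma}{dt}\left((t-1)\frac{d\sigma}{dt} - \sigma - \frac{1}{4}\right)\left(t\frac{d\sigma}{dt} - \sigma\right) \qquad [22]$$

where for $T < T_c$ we set $t = k^{-2}$ and 

$$\sigma_N(t) = t(t-1)\frac{d}{dt}\ln\langle\sigma_{0,0}\sigma_{N,N}\rangle - \frac{1}{4} \qquad [23]$$

and for $T > T_c$ we set $t = k^2$ and 

$$\sigma_N(t) = t(t-1)\frac{d}{dt}\ln\langle\sigma_{0,0}\sigma_{N,N}\rangle - \frac{t}{4} \qquad [24]$$
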
Furthermore it was found by McCoy et al. (1981) 
that for a given temperature the general two spin 
correlation function and all multipoint correlations 
satisfy quadratic nonlinear partial difference equa- 
tions in the locations of the spins. 


Scaling Theory 


It is evident that the results [17] and [18] do not 
reduce to [16] when $k \to 1$. Therefore, in order to 
uniformly characterize the behavior of the correlation 
function in the critical region near $T_c$, it is 
necessary to introduce what is called the scaling 
function. This uniform expansion is obtained by 
introducing a scaled length defined as 

$$r = N/\xi \qquad [25]$$

and considering the joint limit (the scaling limit) where 

$$N \to \infty \text{ and } T \to T_c \text{ with } r \text{ fixed} \qquad [26]$$

We define the scaled correlation function as 

$$G_\pm(r) = \lim_{\text{scaling}} M_\pm^{-2}\langle\sigma_{0,0}\sigma_{N,N}\rangle \qquad [27]$$

where the subscript $\pm$ means that the limit is taken 
from $T > T_c$ or $T < T_c$, respectively, $M_-$ is the 
spontaneous magnetization [11] and 

$$M_+ = (k^2 - 1)^{1/8} \qquad [28]$$


This concept of the scaling limit and scaling 
function is very general and can be defined for any 
system with a critical point that has an order 
parameter like $M_-$ that vanishes at $T_c$ and a 
correlation length that diverges at $T_c$. However, 
the Ising model has the further remarkable property 
discovered by Wu et al. (1976) that the scaled 
correlation function may be explicitly expressed in 
terms of a function which satisfies an ordinary 
nonlinear differential equation. Specifically, 

$$G_\pm(r) = \frac{1 \mp \eta(r/2)}{2\,\eta(r/2)^{1/2}} \exp\int_{r/2}^{\infty} dx\, \frac{x}{4}\,\eta^{-2}\left[(1-\eta^2)^2 - (\eta')^2\right] \qquad [29]$$

where the function $\eta(r)$ satisfies the Painlevé III 
equation 

$$\eta'' = \frac{1}{\eta}(\eta')^2 - \frac{1}{r}\eta' + \eta^3 - \eta^{-1} \qquad [30]$$

with the boundary condition that 

$$\eta(r) \sim 1 - 2\lambda K_0(2r) \quad\text{as } r \to \infty \qquad [31]$$

where $K_0(r)$ is the modified Bessel function of the 
third kind and 

$$\lambda = 1/\pi \qquad [32]$$

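The leading behavior of $G_\pm(r)$ for $r \to \infty$ is 

$$G_+(r) \sim \lambda K_0(r) \qquad [33]$$

$$G_-(r) \sim 1 + \lambda^2\left\{r^2\left[K_1^2(r) - K_0^2(r)\right] - r K_0(r)K_1(r) + \tfrac{1}{2}K_0^2(r)\right\} \qquad [34]$$

where $K_n(z)$ is the modified Bessel function of the 
third kind. When $\lambda$ is given by [32] these $r \to \infty$ 
limits of $G_\pm(r)$ agree with the behavior of 
$\langle\sigma_{0,0}\sigma_{N,N}\rangle$ for $N \gg 1$ and $|T - T_c|$ small with 
$N|T - T_c| \gg 1$, which is obtained from [18] and 
[17]. The behavior of $G_\pm(r)$ for $r \to 0$ with the value 
of $\lambda$ given by [32] is 

$$G_\pm(r) = \text{const}\cdot r^{-1/4} \qquad [35]$$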


where the constant agrees with that computed from 
the result [16] for $\langle\sigma_{0,0}\sigma_{N,N}\rangle$ at $T = T_c$ for $N \gg 1$. 
For other values of the boundary condition constant $\lambda$, 
the scaling function $G_\pm(r)$ diverges with a power 
which differs from 1/4. The computation of the 
constant in [35] requires the evaluation of a nontrivial 
integral involving the Painlevé III function. 
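
The large-$r$ agreement between the lattice results and the scaling functions can be made explicit. The following is a sketch under stated assumptions: it reads [18] as $\langle\sigma_{0,0}\sigma_{N,N}\rangle \sim k^{-N}/[(\pi N)^{1/2}(1-k^{-2})^{1/4}]$, reads the correction term in [17] as $k^{2(N+1)}/[2\pi N^2(1-k^2)^2]$, takes $\xi_\pm^{-1} = \pm\ln k$, and uses the standard asymptotic form $K_0(r) \sim (\pi/2r)^{1/2}e^{-r}$.

```latex
% T > T_c: divide [18] by M_+^2 = (k^2-1)^{1/4}, let k -> 1^+ at fixed r = N ln k.
% Uses (1-k^{-2})^{1/4}(k^2-1)^{1/4} ~ [2(k-1)]^{1/2} and N(k-1) -> r.
\frac{k^{-N}}{(\pi N)^{1/2}\,(1-k^{-2})^{1/4}\,(k^{2}-1)^{1/4}}
  \;\longrightarrow\; \frac{e^{-r}}{(2\pi r)^{1/2}}
  \;=\; \frac{1}{\pi}\left(\frac{\pi}{2r}\right)^{1/2} e^{-r}
  \;\sim\; \lambda K_0(r), \qquad \lambda = \frac{1}{\pi}.

% T < T_c: with r = -N ln k one has k^{2N} = e^{-2r} and N^2 (1-k^2)^2 -> 4 r^2.
\frac{k^{2(N+1)}}{2\pi N^{2}\,(1-k^{2})^{2}}
  \;\longrightarrow\; \frac{e^{-2r}}{8\pi r^{2}} .
```

The first limit is the leading term of [33]. The second is the leading large-$r$ behavior of the $\lambda^2$ term in [34]: when the asymptotic expansions of $K_0$ and $K_1$ are inserted, the contributions of order $r$ and order $1$ inside the braces cancel, leaving $\pi e^{-2r}/(8r^2)$, which multiplied by $\lambda^2 = 1/\pi^2$ gives the same expression.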

The agreement of the limits $r \to \infty$ and $r \to 0$ of 
the function $G_\pm(r)$ with the lattice results near $T_c$ 
means that this scaling function uniformly interpolates 
between $T \ne T_c$ and $T = T_c$ and that the 
lattice spacing (defined here as unity) and the self-generated 
correlation length $\xi$ are the only two 
length scales in the theory. This feature that the 
system generates only one new length scale near $T_c$ 
is referred to as one length scale scaling. 


Susceptibility 
The final quantity of macroscopic thermodynamic 
interest is the magnetic susceptibility 

$$\chi(T) = \left.\frac{\partial M(H)}{\partial H}\right|_{H=0} \qquad [36]$$

which is expressed in terms of the spin-spin 
correlation function as 

$$\chi(T) = \frac{1}{k_BT}\sum_{M,N}\left\{\langle\sigma_{0,0}\sigma_{M,N}\rangle - M_-^2\right\} \qquad [37]$$

The susceptibility may be studied by using the 
determinantal expression for the correlation function. 
The simplest result is obtained (for the 
isotropic case, $E_v = E_h$) by using the scaling form 
[27] to find for $T \sim T_c$ that 

$$k_BT\,\chi_\pm(T) \sim M_\pm^2\,\xi^2\, 2\pi\int_0^\infty dr\, r\left[G_\pm(r) - \Theta_\pm\right] \qquad [38]$$

where $\Theta_+ = 0$ and $\Theta_- = 1$, and thus $\chi_\pm(T)$ diverges 
at $T \to T_c$ as 

$$\chi_\pm(T) \sim C_\pm|T - T_c|^{-7/4} \qquad [39]$$


where $C_\pm$ are transcendental constants given as 
integrals over the scaling function $G_\pm(r)$, which 
were first evaluated by Barouch et al. in 1973 as 

$$C_+ = 0.9625817322\ldots, \qquad C_- = 0.0255369719\ldots \qquad [40]$$


Critical-Exponent Phenomenology 


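From the behavior for the Ising model of the 
specific heat, magnetization, susceptibility, correlation 
length, and the correlation at $T_c$ given 
above we abstract for general systems the phenomenological 
critical-exponent parametrization 
for $T \to T_c^\pm$ of 

$$c \sim A_c^\pm|T - T_c|^{-\alpha_\pm} \qquad [41]$$

$$M \sim A_M|T - T_c|^{\beta} \qquad [42]$$

$$\chi \sim A_\chi^\pm|T - T_c|^{-\gamma_\pm} \qquad [43]$$

$$\xi \sim A_\xi^\pm|T - T_c|^{-\nu_\pm} \qquad [44]$$

and at $T = T_c$ for $R \to \infty$ 

$$\langle\sigma_0\sigma_R\rangle \sim A_\sigma/R^{d-2+\eta} \quad\text{where } d \text{ is the dimension} \qquad [45]$$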


The exponents $\alpha_\pm$, $\gamma_\pm$, $\nu_\pm$ above and below $T_c$ are usually 
found to be equal, and the exponent $\eta$ is usually called the 
anomalous dimension. If it is assumed that the scaling 
function [27] exists and that one length scale scaling holds, 
then the exponents are related by what are called scaling 
laws, such as 

$$2\beta = \nu(d - 2 + \eta) \qquad [46]$$
$$\alpha_- + 2\beta + \gamma_- = 2 \qquad [47]$$
$$d\nu_- = 2 - \alpha_- \qquad [48]$$
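
The scaling laws [46] to [48] can be checked against the two-dimensional Ising exponents read off from the results above: $\alpha = 0$ (the logarithm in [7]), $\beta = 1/8$ from [11], $\gamma = 7/4$ from [39], $\nu = 1$ from [21], and $\eta = 1/4$ from [16] with $d = 2$. The sketch below uses exact rational arithmetic; the dictionary keys and function name are illustrative choices, not notation from the text.

```python
from fractions import Fraction as F

# Exponents of the two-dimensional Ising model as read off from [7], [11],
# [16], [21] and [39]; alpha = 0 stands for the logarithmic divergence.
ising_2d = {
    "d": 2,
    "alpha": F(0),
    "beta": F(1, 8),
    "gamma": F(7, 4),
    "nu": F(1),
    "eta": F(1, 4),
}


def scaling_law_residuals(e):
    """Left side minus right side of [46], [47], [48]; all zero if they hold."""
    return {
        "[46]": 2 * e["beta"] - e["nu"] * (e["d"] - 2 + e["eta"]),
        "[47]": e["alpha"] + 2 * e["beta"] + e["gamma"] - 2,
        "[48]": e["d"] * e["nu"] - (2 - e["alpha"]),
    }


print(scaling_law_residuals(ising_2d))
```

All three residuals vanish identically, which is the sense in which the Ising model is the prototype for the phenomenology.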


Thus, from the properties of the Ising model near 
$T_c$, we have obtained a phenomenology for use on 
all systems near the critical point. 


Fuchsian Equations and Natural 
Boundaries for Susceptibility 


This critical phenomenology, however, has not 
taken into account the fact that the susceptibility 
is a much more complicated function than either 
the spontaneous magnetization [11] or the free 
energy [5], which have only isolated singularities at 
$k^2 = 1$, and that there is more structure to the 
susceptibility than the singularity of [39]. 

For arbitrary $T$, the susceptibility was shown by 
Wu, McCoy, Tracy, and Barouch to be expressible 
in the form 

$$\chi_\pm(T) = M_\pm^2\sum_j \chi^{(j)}(T) \qquad [49]$$

where in the sum $j$ is odd (even) for $T$ above (below) 
$T_c$. The quantities $\chi^{(j)}(T)$ are explicitly given as 
$j$-fold integrals of algebraic functions and thus will 
satisfy linear differential equations with polynomial 
coefficients. Such functions can have only isolated 
singularities. The function $\chi^{(1)}(T)$ is elementary and 
has a double pole at $T_c$, and $\chi^{(2)}(T)$ is given in terms 
of complete elliptic integrals. Quite recently, 
remarkable Fuchsian linear differential equations 
for $\chi^{(3)}(T)$ and $\chi^{(4)}(T)$ of seventh and tenth orders, 
respectively, have been obtained by Zenine, Boukraa, 
Hassani, and Maillard for the isotropic lattice. 

Furthermore, it was shown by Orrick et al. (2001) 
that $\chi^{(j)}$ has singularities in the complex $T$ plane at 

$$\cosh(2E_v/k_BT)\cosh(2E_h/k_BT) - \sinh(2E_h/k_BT)\cos(2\pi m/j) - \sinh(2E_v/k_BT)\cos(2\pi m'/j) = 0 \qquad [50]$$

with $m, m' = 1, 2, \ldots, j$. The form of the singularity 
in $\chi^{(j)}(T)$ for $T > T_c$ is as 

$$\epsilon^{(j^2-3)/2}\ln\epsilon \qquad [51]$$

and, for $T < T_c$, it is as 

$$\epsilon^{(j^2-3)/2} \qquad [52]$$

where $\epsilon$ measures the deviation from the singular 
point [50]. These singularities become dense as 
$j \to \infty$ and, therefore, the singularity at $T = T_c$ is 
not isolated and instead the critical point is 
embedded in a natural boundary. Such a function 
cannot satisfy a linear differential equation of finite 
order with polynomial coefficients. 
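
The accumulation of singularities can be visualized in the isotropic case. Assuming $E_v = E_h = E$ and writing $s = \sinh(2E/k_BT)$ (a variable introduced here only for illustration), condition [50] becomes the quadratic $s^2 - cs + 1 = 0$ with $c = \cos(2\pi m/j) + \cos(2\pi m'/j)$. Because $|c| \le 2$, both roots lie on the unit circle $|s| = 1$, and $s = 1$ is the physical critical point. A minimal sketch:

```python
import cmath
import math


def isotropic_singularities(j):
    """Roots s = sinh(2E/k_B T) of the isotropic form of [50] for given j."""
    points = []
    for m in range(1, j + 1):
        for m_prime in range(1, j + 1):
            c = math.cos(2 * math.pi * m / j) + math.cos(2 * math.pi * m_prime / j)
            root = cmath.sqrt(c * c - 4)  # purely imaginary since |c| <= 2
            points.append((c + root) / 2)
            points.append((c - root) / 2)
    return points


def distinct(points, digits=9):
    """Deduplicate roots up to rounding."""
    return {(round(p.real, digits), round(p.imag, digits)) for p in points}


for j in (2, 4, 8, 16):
    roots = isotropic_singularities(j)
    print(j, len(distinct(roots)), max(abs(abs(p) - 1) for p in roots))
```

The number of distinct points on the circle grows with $j$, which illustrates how the singularities fill in the unit circle and enclose the critical point as $j \to \infty$.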

The existence of the natural boundary in the 
susceptibility is a new phenomenon which is not 
seen in either the free energy or magnetization and 
leads to the speculation that in the presence of a 
magnetic field the one length scale scaling property 
of the model at $H = 0$ may fail. If this proves to be 
correct, there will be physical effects which are not 
incorporated in the phenomenological scaling theory 
of critical phenomena. 


Impure Ising Models 


The Ising model may also be studied when the 
interaction energies at sites j,k are not chosen to be 
independent of position but are allowed to vary 
from site to site. When these interactions are chosen 
randomly out of some probability distribution, this 
is a model of a ferromagnet with frozen (quenched) 
impurities. All real systems will be impure to some 
extent, so the study of such dirty systems is of great 
practical importance. 

The special case where the interactions are transla- 
tionally invariant in the horizontal direction but are 
allowed to vary in a layered fashion from row to row 
was introduced by McCoy and Wu in 1968 and 
found to be dramatically different from the pure Ising 
model described above. In particular, what is a 
critical temperature $T_c$ in the pure case is now spread 
out into a region bounded by the temperatures at which the 
pure model would be critical if all the bonds took on 
the minimum or maximum value allowed by the 
probability distribution. In this new region, the 
correlations (in the direction of translational invariance) 
are found to decay as a power law which 
depends on the temperature; the specific heat is never 
infinite but the susceptibility is infinite in an entire 
temperature region that includes the temperature at 
which the spontaneous magnetization first appears as 
$T$ is lowered. The existence of this new region for 
Ising models with a general randomness in two and 
three dimensions has been demonstrated by Griffiths. 
More recently, this effect has been reinterpreted in 
terms of impurities in quantum spin chains. 
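
The boundaries of this region follow from the pure-model criticality condition [6] applied to the extreme bond values. The sketch below is illustrative only: it assumes, for definiteness, that only the vertical bond varies between hypothetical bounds `E_v_min` and `E_v_max` while the horizontal bond `E_h` is fixed, it sets $k_B = 1$, and it solves [6] by bisection on the logarithm of the product of hyperbolic sines to avoid overflow at low temperature.

```python
import math


def log_sinh(x):
    """Numerically stable ln(sinh(x)) for x > 0."""
    return x + math.log1p(-math.exp(-2.0 * x)) - math.log(2.0)


def critical_temperature(E_v, E_h, iterations=100):
    """Solve sinh(2 E_v / T) sinh(2 E_h / T) = 1 for T (units with k_B = 1)."""
    def g(T):
        return log_sinh(2.0 * E_v / T) + log_sinh(2.0 * E_h / T)

    lo = 1e-3 * min(E_v, E_h)   # g(lo) > 0: deep in the ordered phase
    hi = 100.0 * max(E_v, E_h)  # g(hi) < 0: deep in the disordered phase
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def critical_window(E_h, E_v_min, E_v_max):
    """Temperatures bounding the region between the two pure-model T_c values."""
    return critical_temperature(E_v_min, E_h), critical_temperature(E_v_max, E_h)


print(critical_temperature(1.0, 1.0))
print(critical_window(1.0, 0.5, 1.5))
```

For equal bonds the solver reproduces the isotropic value $T_c = 2E/\ln(1+\sqrt{2})$, and the window always brackets the critical temperature of any pure model whose vertical bond lies between the two extremes.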


Quantum Field Theory 
The Ising model of [1] may be reinterpreted as a two-dimensional 
lattice gauge theory of the gauge field 

$$S_{j+1/2,k} = \pm 1 \quad\text{on the vertical link between } (j,k) \text{ and } (j+1,k)$$
$$S_{j,k+1/2} = \pm 1 \quad\text{on the horizontal link between } (j,k) \text{ and } (j,k+1) \qquad [53]$$

and a “Higgs” field 

$$\phi_{j,k} = \pm 1 \quad\text{on the site } (j,k) \qquad [54]$$

with the action 

$$S_g = -E_p\sum_{j,k} S_{j+1/2,k}\,S_{j+1,k+1/2}\,S_{j+1/2,k+1}\,S_{j,k+1/2} - E_l\sum_{j,k}\left(\phi_{j,k}S_{j+1/2,k}\phi_{j+1,k} + \phi_{j,k}S_{j,k+1/2}\phi_{j,k+1}\right) \qquad [55]$$
T 


If we define 

$$z_{p,l} = \tanh(E_{p,l}/k_BT) \qquad [56]$$

the partition function of the gauge theory is expressed 
in terms of the Ising model partition function as 

$$Z_g = \left[8\cosh(E_p/k_BT)\cosh^2(E_l/k_BT)\, z_p^{1/2} z_l\right]^N Z_I(H) \qquad [57]$$

where we make the identification 

$$H/k_BT = -\tfrac{1}{2}\ln z_p \quad\text{and}\quad E/k_BT = -\tfrac{1}{2}\ln z_l \qquad [58]$$


This identification may be extended to correlation 
functions. Of particular interest for the gauge theory is 
the plaquette-plaquette correlation $\langle P_{0,0}P_{j,k}\rangle$, where 

$$P_{j,k} = S_{j+1/2,k}\,S_{j+1,k+1/2}\,S_{j+1/2,k+1}\,S_{j,k+1/2} \qquad [59]$$

which is expressed in terms of the Ising correlations 
at $H \ne 0$ as 

$$\langle P_{0,0}P_{j,k}\rangle - \langle P_{0,0}\rangle^2 = \sinh^2(2H/k_BT)\left(\langle\sigma_{0,0}\sigma_{j,k}\rangle - \langle\sigma_{0,0}\rangle^2\right) \qquad [60]$$


To study this correlation further, we need to study 
the correlations of the Ising model in nonzero 
magnetic field. This has been done by McCoy and 
Wu in the scaling limit $H \to 0$, $T \to T_c$ with 

H 


for $T < T_c$, where it is found that the scaling 
function $G(r, h)$ for small $h$ and large $r$ is 


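$$G(r,h) \sim \sum_l a h\, K_0\left((2 + h^{2/3}\lambda_l)\,r\right) \sim \left(\frac{\pi}{4r}\right)^{1/2} e^{-2r}\sum_l a h\, e^{-h^{2/3}\lambda_l r} \qquad [62]$$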


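where $\lambda_l$ are the solutions of 

$$J_{1/3}\left(\tfrac{1}{3}\lambda_l^{3/2}\right) + J_{-1/3}\left(\tfrac{1}{3}\lambda_l^{3/2}\right) = 0 \qquad [63]$$

with $J_n(z)$ the Bessel function of order $n$ and $K_0(z)$ 
the modified Bessel function of the third kind. 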

A field theory is said to possess a particle spectrum 
if the Fourier transform of the two-point function 

$$G(k,h) = \int d^2r\, e^{ik\cdot r}\, G(r,h) \qquad [64]$$

has poles of the form $A_l/(k^2 + m_l^2)$, where $m_l$ is the 
mass of the $l$th particle. If we note that the Fourier 
transform of $K_0(r)$ is 

$$\int d^2r\, e^{ik\cdot r} K_0(r) = \frac{2\pi}{k^2+1} \qquad [65]$$


we see that the Fourier transform of [62] is the sum of 
an infinite number of poles. This is to be compared 
with the Fourier transform of the scaled correlation 
function $G_-(r)$ at $H = 0$ and $T < T_c$ [34], which does 
not contain any poles at all and may instead be 
interpreted as having a two-particle cut. This phenomenon 
of a cut at $h = 0$ breaking up into an infinite 
number of poles for $h > 0$ is a signal that at $h = 0$ the 
theory has free unconfined two-particle states which 
become weakly confined by a linear confining 
potential for $h > 0$. This confinement is thought to 
be a characteristic of most gauge theories. 


See also: Eight Vertex and Hard Hexagon Models; 
Holonomic Quantum Fields; Painlevé Equations; 
Percolation Theory; Phase Transitions in Continuous 
Systems; Statistical Mechanics and Combinatorial 
Problems; Toeplitz Determinants and Statistical 
Mechanics; Yang-Baxter Equations. 


History and Motivation 


Local quantum physics of systems with infinitely many 
interacting degrees of freedom leads to situations 
whose understanding often requires new physical 
intuition and mathematical concepts beyond that 
acquired in quantum mechanics and perturbative 
constructions in quantum field theory. In this situa- 
tion, two-dimensional soluble models turned out to 
play an important role. On the one hand, they 
illustrate new concepts and sometimes remove mis- 
conceptions in an area where new physical intuition is 
still in the process of being formed. On the other hand, 
rigorously soluble models confirm that the underlying 
physical postulates are mathematically consistent, a 
task which for interacting systems with infinite degrees 
of freedom is mostly beyond the capability of 
pedestrian methods or brute force application of hard 
analysis on models whose natural invariances have 
been mutilated by a cutoff. 

In order to underline these points and motivate 
the interest in two-dimensional QFT, let us briefly 
look at the history, in particular at the physical 
significance of the three oldest two-dimensional 
models of relevance for statistical mechanics and 
relativistic particle physics, in chronological order: 
the Lenz-Ising (L-I) model, Jordan's model of 
bosonization/fermionization, and the Schwinger 
model (QED$_2$). (A more detailed account of the 
changeful history concerning their correct physical 
interpretation and generalizations to higher dimen- 
sions of these models and the increasing conceptual 
role of low-dimensional models in QFT can be 
found in Schroer (2005).) 

The L-I model was proposed in 1920 by Wilhelm 
Lenz (see Lenz (1925)) as the simplest discrete 
statistical mechanics model with a chance to go 
beyond the P. Weiss phenomenological ansatz 
involving long-range forces and instead explain ferromagnetism 
in terms of nonmagnetic short-range 
interactions. Its one-dimensional version was solved 
four years later by his student Ernst Ising. Its changeful 
history reached a temporary conceptual climax when 
Onsager succeeded in rigorously establishing a second-order 
phase transition in two dimensions. 

Another conceptually rich model which lay 
dormant for almost two decades as a result of a 
misleading speculative higher-dimensional general- 
ization by its protagonist is the bosonization/ 
fermionization model first proposed by Jordan 
(1937). This model establishes a certain equivalence 
between massless two-dimensional fermions and 
bosons and is related to Thirring's massless 
4-fermion coupling model and also to Luttinger's 
one-dimensional model of an electron gas (Schroer). 
One reason why even nowadays hardly anybody 
knows Jordan's contribution is certainly the ambitious 
but unfortunate title “the neutrino theory of 
light” under which he published a series of papers. 

Both discoveries demonstrate the usefulness of 
having controllable low-dimensional models; at the 
same time, their complicated history also illustrates 
the danger of rushing to premature “intuitive” 
conclusions about extensions to higher dimensions. 

A review of the early historical benchmarks of 
conceptual progress through the study of solvable 
two-dimensional models would be incomplete 
without mentioning Schwinger's (1962) proposed 
solution of two-dimensional quantum electrody- 
namics, afterwards referred to as the Schwinger 
model. He used this model in order to argue that 
gauge theories are not necessarily tied to zero-mass 
vector particles. Some work was necessary 
(Schroer) to unravel its physical content, with the 
result that the would-be charge of that QED$_2$ 
model was “screened” and its apparent chiral 
symmetry broken; in other words, the model exists 
only in the so-called Schwinger-Higgs phase with 
massive free scalar particles accounting for its 
physical content. Another closely related aspect of 
this model which also arose in the Lagrangian 
setting of four-dimensional gauge theories was that 
of the $\theta$-angle parametrizing an ambiguity in the 
quantization. 

A coherent and systematic attempt at a mathema- 
tical control of two-dimensional models came in the 
wake of Wightman’s first rigorous programmatic 
formulation of QFT (Schroer 2005). This formula- 
tion stayed close to the physical ideas underlying the 
impressive success of renormalized QED perturba- 
tion theory, although it avoided the direct use of 
Lagrangian quantization. The early attempts 
towards a “constructive QFT” found their successful 
realization in two-dimensional QFT (the $P(\varphi)_2$ models 
(Glimm and Jaffe 1987)); the restriction to low 
dimensions is related to the mild short-distance 
singularity behavior (super-renormalizability) which 
these methods require. We will focus our main 
attention on alternative constructive methods which, 
even though they do not suffer from such short-distance 
restrictions, still lack mathematical 
control in higher spacetime dimensions; the illustration 
of the constructive power of these new methods 
comes presently from massless $d = 1 + 1$ conformal 
and chiral QFT as well as from massive factorizing 
models. 

There are several books and review articles 
(Furlan et al. 1989, Ginsparg 1990, Di Francesco 
et al. 1996) on d=1 +1 conformal as well as on 
massive factorizing models (Abdalla et al. 1991). To 
the extent that concepts and mathematical structures 
are used which permit no extension to higher 
dimensions (Kac-Moody algebras, loop groups, 
integrability, presence of an infinite number of 
conservation laws), this line of approach will not 
be followed in this article since our primary interest 
will be the use of two-dimensional models of QFT 
as “theoretical laboratories” of general QFT. Our 
aim is twofold; on the one hand, we intend to 
illustrate known principles of general QFT in a 
mathematically controllable context and on the 
other hand, we want to identify new concepts 
whose adaptation to QFT in $d = 1 + 1$ leads to their 
solvability (Schroer). 


General Concepts and Their 
Two-Dimensional Manifestation 


The general framework of QFT, to which the rich 
world of controllable two-dimensional models con- 
tributes as an important testing ground, exists in 
two quite different but nevertheless closely related 
formulations: the 1956 approach in terms of pointlike 
covariant fields due to Wightman (see Streater 
and Wightman (1964)) (see Axiomatic Quantum 
Field Theory), and the more algebraic setting which 
can be traced back to ideas which Haag (1992) 
developed shortly after and which are based on 
spacetime-indexed operator algebras and related 
concepts which developed over a long period of 
time, with contributions of many other authors to 
what is now referred to as algebraic QFT (AQFT) or 
simply local quantum physics (LQP). Whereas the 
Wightman approach aims directly at the (not 
necessarily observable) quantum fields, the opera- 
tor-algebraic setting (see Algebraic Approach to 
Quantum Field Theory) is more ambitious. It starts 
from physically well-motivated assumptions about 
the algebraic structure of local observables and aims 
at the reconstruction of the full field theory 
(including the operators carrying the superselected 
charges) in the spirit of a local representation theory 
of (the assumed structure of the) local observables. 
This has the advantage that the somewhat myster- 
ious concept of an inner symmetry (as opposed to 
outer (spacetime) symmetry) can be traced back to 
its physical roots which is the representation- 
theoretical structure of the local observable algebra 
(see Symmetries in Quantum Field Theory of Lower 
Spacetime Dimensions). In the standard Lagrangian 
quantization approach, the inner symmetry is part of 
the input (multiplicity indices of field components 
on which subgroups of U(n) or O(n) act linearly) 
and hence it is not possible to problematize this 
fundamental question. When in low 
spacetime dimensions the sharp separation (the 
Coleman-Mandula theorems) of inner versus outer 
symmetry becomes blurred as a result of the 
appearance of braid group statistics, the standard 
Lagrangian quantization setting of most of the 
textbooks is inappropriate and even the Wightman 
framework has to be extended. In that case, the 
algebraic approach is the most appropriate. 

The important physical principles which are shared 
between the Wightman approach (see Streater and 
Wightman (1964)) and the operator algebra (AQFT) 
setting (Haag 1992) are the spacelike locality or 
Einstein causality (in terms of pointlike fields or 
algebras localized in causally disjoint regions) and 
the existence of positive-energy representations of 
the Poincaré group implementing covariance and the 
stability of matter. In the algebraic approach, the 
observable content of the theory is encoded into a 
family of (weakly closed) operator algebras 
$\{\mathcal{A}(\mathcal{O})\}_{\mathcal{O}\in\mathcal{K}}$ indexed by a family of convex causally 
closed spacetime regions $\mathcal{O}$ (with $\mathcal{O}'$ denoting the 
spacelike complement and $\mathcal{A}'$ the von Neumann 
commutant) which act in one common Hilbert space. 
Covariant local fields lose the distinguished role 
which they have in the classical setting and which 
(via Lagrangian quantization) was at least partially 
inherited by the Wightman approach and, apart from 
their role as local generators of symmetries (conserved 
currents), become mere “field coordinatizations” 
of local algebras. (There is a denumerable set 
of such pointlike field generators which form a local 
equivalence (Borchers) class of fields and in the 
absence of interactions permits a neat description in 
terms of Wick-ordered free-field polynomials (Haag 
1992).) Certain properties cannot be naturally formulated 
in the pointlike field setting (e.g., Haag 
duality for convex regions $\mathcal{A}(\mathcal{O}') = \mathcal{A}(\mathcal{O})'$), but apart 
from those properties the two formulations are quite 
close; in particular for two-dimensional theories there 
are convincing arguments that one can pass between 
the two without imposing additional technical 
requirements. (Haag duality holds for observable 
algebras in the vacuum sector in the sense that any 
violation can be explained in terms of a spontaneously 
broken symmetry; in local theories, it can 
always be enforced by dualization and the resulting 
Haag dual algebra has a charge superselection 
structure associated with the unbroken subgroup.) 
Haag duality is the statement that the commutant of 
observables not only contains the algebra of the 
causal complement, that is, $\mathcal{A}(\mathcal{O}') \subset \mathcal{A}(\mathcal{O})'$ (Einstein 
causality), but is even exhausted by it; it is deeply 
connected to the measurement process and its 
violation in the vacuum sector for convex causally 
complete regions signals spontaneous symmetry 
breaking in the associated charge-carrying field 
algebra (Haag 1992). It can always be enforced 
(assuming that the wedge-localized algebras fulfill 
[1] below) by a symmetry-reducing extension called 
Haag dualization. Its violation for multilocal regions 
reveals the charge content of the model via charge-anticharge 
splitting in the neutral observable algebra 
(Schroer). 

Another physically important property which has 
a natural algebraic formulation is the split property: 
for regions $\mathcal{O}_i$ separated by a finite spacelike 
distance, one finds $\mathcal{A}(\mathcal{O}_1 \cup \mathcal{O}_2) \simeq \mathcal{A}(\mathcal{O}_1) \otimes \mathcal{A}(\mathcal{O}_2)$, 
which can be derived from the Buchholz-Wichmann 
“nuclearity property” (Haag 1992) (an appropriate 
adaptation of the “finiteness of phase-space cell” 
property of QM to QFT). Related to Haag 
duality is the local version of the “time slice 
property” (the QFT counterpart of the classical 
causal dependency property), sometimes referred to 
as “strong Einstein causality” $\mathcal{A}(\mathcal{O}'') = \mathcal{A}(\mathcal{O})''$. 

One of the most astonishing achievements of the 
algebraic approach (which justifies its emphasis on 
properties of “local observables”) is the DHR theory 
of superselection sectors (Doplicher et al. 1971), 
that is, the realization that the structure of charged 
(nonvacuum) representations (with the superposition 
principle being valid only within one representation) 
and the spacetime properties of the 
generating fields which are the carriers of these 
generalized charges (including their spacelike commutation 
relations which lead to the particle 
statistics and also to their internal symmetry properties) 
are already encoded in the structure of the 
Einstein causal observable algebra (see Symmetries in 
Quantum Field Theory: Algebraic Aspects). The 
intuitive basis of this remarkable result (whose 
prerequisite is locality) is that one can generate 
charged sectors by spatially separating charges in the 
vacuum (neutral) sector and disposing of the 
unwanted charges at spatial infinity (Haag 1992). 

An important concept which especially in $d = 1 + 1$ 
has considerable constructive clout is “modular 
localization.” It is a consequence of the above 
algebraic setting if either the net of algebras has 
pointlike field generators, or if the one-particle 
masses are separated by spectral gaps so that the 
formalism of time-dependent scattering can be 
applied (Schroer 2005); in conformal theories, this 
property holds automatically in all spacetime 
dimensions. It rests on the basic observation 
(Tomita-Takesaki Modular Theory) that a standard 
pair $(\mathcal{A}, \Omega)$ of a von Neumann operator algebra and 
a standard vector (standardness means that the 
operator algebra of the pair $(\mathcal{A}, \Omega)$ acts cyclically and 
separatingly on the vector $\Omega$) gives rise to a Tomita 
operator $S$ through its star-operation, whose polar 
decomposition yields two modular objects: a one-parametric 
subgroup $\Delta^{it}$ of the unitary group of 
operators in Hilbert space, whose Ad-action defines 
the modular automorphism of $(\mathcal{A}, \Omega)$, and the 
angular part $J$, the modular conjugation, which 
maps $\mathcal{A}$ into its commutant $\mathcal{A}'$: 

$$SA\Omega = A^*\Omega, \qquad S = J\Delta^{1/2}$$
$$J_W = U(j_W) = S_{\text{scat}}J_0, \qquad \Delta_W^{it} = U(\Lambda_W(2\pi t)) \qquad [1]$$
$$\sigma_W(t) := \mathrm{Ad}\,\Delta_W^{it}$$


The standardness assumption is always satisfied for 
any field-theoretic pair $(\mathcal{A}(\mathcal{O}), \Omega)$ of an $\mathcal{O}$-localized 
algebra and the vacuum state (as long as $\mathcal{O}$ has a 
nontrivial causal disjoint $\mathcal{O}'$), but it is only for the 
wedge region $W$ that the modular objects have a 
physical interpretation in terms of the global 
symmetry group of the vacuum as specified in the 
second line of [1]; the modular unitary $\Delta_W^{it}$ 
represents the $W$-associated boost $\Lambda_W(\chi)$ and the 
modular conjugation $J_W$ implements the TCP-like 
reflection along the edge of the wedge (Bisognano 
and Wichmann 1975). The third line is the definition 
of the modular group. The importance of this 
theory for local quantum physics results from the 
fact that it leads to the concept of modular 
localization, an intrinsic new scenario for field-theoretic 
constructions which is different from the 
Lagrangian quantization schemes (Schroer 2005). 

A special feature of $d = 1 + 1$ Minkowski spacetime 
is the disconnectedness of the right/left spacelike region, 
leading to a right-left ordering structure. So in addition 
to the Lorentz-invariant timelike ordering $x < y$ ($x$ 
earlier than $y$, which is independent of spacetime 
dimensions), there is an invariant spacelike ordering 
$x \ll y$ ($x$ to the left of $y$) in $d = 1 + 1$ which opens the 
possibility of more general Lorentz-invariant spacelike 
commutation relations than those implemented by 
Bose/Fermi fields (Rehren and Schroer 1987), namely those of fields 
with a spacelike braid group commutation structure. 
The appearance of such exotic statistics fields is not 
compatible with their Fourier transforms being creation/annihilation 
operators for Wigner particles; 
rather, the state vectors which they generate from the 
vacuum contain, in addition to the one-particle 
contribution, a vacuum polarization cloud (Schroer 
2005). This close connection between new kinematic 
possibilities and interactions is one of the reasons why 
low-dimensional QFT (unlike higher dimensions, where interactions 
are prescribed by the recipe of local couplings of free 
fields) offers a more intrinsic 
access to the central issue of interactions. 


Boson/Fermion Equivalence and 
Superselection Theory in a Special Model 


The simplest and oldest but conceptually still rich 
model is obtained, as first proposed by Jordan 
(1937), by using a two-dimensional massless Dirac 
current and showing that it may be expressed in 
terms of scalar canonical Bose creation/annihilation 
operators 

$$j_\mu = {:}\bar\psi\gamma_\mu\psi{:} = \partial_\mu\phi, \qquad \phi(x) := \int_{-\infty}^{+\infty}\left\{e^{ipx}a^*(p) + \text{h.c.}\right\}\frac{dp}{2|p|} \qquad [2]$$

Although the potential ó(x) of the current as a result 
of its infrared divergence is not a field in the 
standard sense of an operator-valued distribution 
in the Fock space of the a(p)* (It becomes an 
operator after smearing with test functions whose 
Fourier transform vanishes at p=0), the formal 
exponential defined as the zero-mass limit of a well- 
defined exponential free massive field 

: eit). — lim mé? . e'aó, x) : [3] 


m—( 
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turns out to be a bona fide quantum field in a larger 
Hilbert space (which extends the Fock space 
generated from applying currents to the vacuum). 
The power in front is determined by the requirement 
that all Wightman functions (computed with the 
help of free-field Wick combinatorics) stay finite in 
this massless limit; the necessary and sufficient 
condition for this is the charge conservation rule 


    \Bigl\langle \prod_i {:}e^{i\alpha_i\phi(x_i)}{:} \Bigr\rangle =
    \begin{cases}
      \prod_{i<j} \bigl[ -(\xi_{+ij})_\varepsilon (\xi_{-ij})_\varepsilon \bigr]^{(1/2)\alpha_i\alpha_j}, & \sum_i \alpha_i = 0 \\
      0, & \text{otherwise}
    \end{cases}    [4]

where the resulting correlation function has been
factored in terms of light-ray coordinates
ξ_{±ij} = x_{±i} − x_{±j}, x_± = t ± x, and the ε-prescription
stands for taking the standard Wightman boundary
value t → t + iε, lim_{ε→0}, which ensures the
positive-energy condition. The finiteness of the
limit ensures that the resulting zero-mass limiting
theory is a bona fide quantum field theory, that is,
its system of Wightman functions permits the
construction of an operator theory in a Hilbert
space with a distinguished vacuum vector.
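
The role of the power of m in [3] can be made explicit by a short power-counting argument. The following is only a sketch, under the assumption (not spelled out in the text) that the massive two-point function behaves at short distances like ⟨φ_m(x_i) φ_m(x_j)⟩ ≈ −(1/2) ln(−m² ξ_{+ij} ξ_{−ij}), and that free-field Wick combinatorics gives the correlation of normal-ordered exponentials as the exponential of the pairwise contractions:

```latex
\begin{align*}
\Bigl\langle \prod_i m^{\alpha_i^2/2}\, {:}e^{i\alpha_i\phi_m(x_i)}{:} \Bigr\rangle
 &= \prod_i m^{\alpha_i^2/2}\,
    \exp\Bigl( -\sum_{i<j} \alpha_i\alpha_j
      \langle \phi_m(x_i)\,\phi_m(x_j) \rangle \Bigr) \\
 &\approx m^{\frac12\sum_i \alpha_i^2 + \sum_{i<j}\alpha_i\alpha_j}
    \prod_{i<j} \bigl[ -\xi_{+ij}\,\xi_{-ij} \bigr]^{\frac12\alpha_i\alpha_j} \\
 &= m^{\frac12 \left(\sum_i \alpha_i\right)^2}
    \prod_{i<j} \bigl[ -\xi_{+ij}\,\xi_{-ij} \bigr]^{\frac12\alpha_i\alpha_j}
\end{align*}
```

Under these assumptions the prefactor tends to zero as m → 0 unless the charges add up to zero, in which case it equals 1; this is the charge conservation rule stated above.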

The factorization into light-ray components [4]
shows that the exponential charge-carrying operators
inherit this factorization into two independent
chiral components :exp iαφ(x): = :exp iαφ_+(x_+):
:exp iαφ_−(x_−):, each one being covariant under
scaling ξ → λξ if one assigns the scaling dimension
d = α²/2 to the chiral exponential field and d = 1 to
the current. Like any Wightman field, this is a singular
object which yields an (unbounded) operator only after
smearing with Schwartz test functions. But the
above form of the correlation function belongs to a
class of distributions which admits a much larger
test-function space consisting of smooth functions
which, instead of decreasing rapidly, only need to be
bounded so that they stay finite on the compactified
light-ray line Ṙ = S¹. To make this visible, one uses
the Cayley transform (now x denotes either x_+
or x_−)

    z = \frac{1 + ix}{1 - ix} \in S^1    [5]

This transforms the Schwartz test functions into a space
of test functions on S¹ which have an infinite-order
zero at z = −1 (corresponding to x = ±∞), but the
rotationally transformed fields j(z), :exp iαφ(z): permit
the smearing with all smooth functions on S¹, a
characteristic feature of all conformally invariant
theories, as the present one turns out to be. There is an
additional advantage in the use of this compactification.
Fourier transforming the circular current actually
allows for a quantum-mechanical zero mode whose
possible nonzero eigenvalues indicate the presence of
additional charge sectors beyond the charge-zero
vacuum sector. For the exponential field, this leads to
a quantum-mechanical pre-exponential factor which
automatically ensures the charge selection rules, so that
unrestricted (by charge conservation) Wick contraction
rules can be applied. In this approach, the
original chiral Dirac fermion ψ(x) (from which the
current was formed as the :ψ̄γψ: composite)
reappears as a charge-carrying exponential field
for α = 1 and thus illustrates the meaning of
bosonization/fermionization. (It is interesting to
note that Jordan's (1937) original treatment of
fermionization had such a pre-exponential
quantum-mechanical factor.) Naturally, this terminology
has to be taken with a grain of salt in view of
the fact that the bosonic current algebra only
generates a superselected subspace into which the
charge-carrying exponential field does not fit.
Only in the case of massive two-dimensional QFT
can fermions be incorporated into a Fock space of
bosons (see last section). At this point, it should
however be clear to the reader that the physical
content of Jordan's paper had nothing to do with
its misleading title “neutrino theory of light” but
rather was an early illustration of charge
superselection rules in two-dimensional QFT.
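
The compactification used above can be checked numerically. The following is a minimal sketch, assuming the Cayley transform has the standard form z = (1 + ix)/(1 − ix); the function names `cayley` and `inverse_cayley` are illustrative and not part of the text.

```python
import cmath


def cayley(x):
    """Map a point x of the light ray R to z = (1 + ix)/(1 - ix) on S^1."""
    return (1 + 1j * x) / (1 - 1j * x)


def inverse_cayley(z):
    """Recover x from a point z != -1 of S^1 via x = -i (z - 1)/(z + 1)."""
    return (-1j * (z - 1) / (z + 1)).real


if __name__ == "__main__":
    # Every real x lands on the unit circle; x -> +/- infinity approaches z = -1.
    for x in (-1e6, -1.0, 0.0, 1.0, 1e6):
        z = cayley(x)
        print(f"x = {x:>10}: |z| = {abs(z):.6f}, arg z = {cmath.phase(z):.6f}")
```

In this sketch the single point z = −1 plays the role of x = ±∞, which is why Schwartz functions on the line correspond to functions on the circle vanishing to infinite order there.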

A systematic and rigorous approach consists in 
solving the problem of positive-energy representa- 
tion theory for the Weyl algebra on the circle (which 
is the rigorous operator-algebraic formulation of the 
abelian current algebra). (The Weyl algebra origi- 
nated in quantum mechanics around 1927; its use in 
QFT only appeared after the cited Jordan paper. By 
representation we mean here a regular representa- 
tion in which the exponentials can be differentiated 
in order to obtain (unbounded) smeared current 
operators.) It is the operator algebra generated by 
the exponential of a smeared chiral current (always 
with real test functions) with the following relation 
between the generators 


    W(f) = e^{i j(f)}, \quad j(f) = \oint \frac{dz}{2\pi i}\, j(z) f(z), \quad [j(z), j(z')] = -\delta'(z - z')    [6a]

    W(f) W(g) = e^{-(1/2) s(f,g)}\, W(f + g), \quad W^*(f) = W(-f)    [6b]

    \mathcal{A}(S^1) = \mathrm{alg}\{ W(f),\ f \in C^\infty(S^1) \}, \quad \mathcal{A}(I) = \mathrm{alg}\{ W(f),\ \mathrm{supp}\, f \subset I \}    [6c]

where

    s(f,g) = \oint \frac{dz}{2\pi i}\, f'(z) g(z)

is the symplectic form which characterizes the
Weyl algebra structure and [6c] denotes the
C* algebra generated by the unitary objects
W(f). A particular representation of this algebra is
given by assigning the vacuum state to the
generators: ⟨W(f)⟩_0 = e^{−(1/2)‖f‖_0²}, ‖f‖_0² = Σ_{n≥1} n |f_n|².
Starting with the vacuum Hilbert space representation
A(S¹)_0 = π_0(A(S¹)), one easily checks that
the formula


    \langle W(f) \rangle_\alpha = e^{i\alpha f_0}\, \langle W(f) \rangle_0    [7a]

    \pi_\alpha(W(f)) = e^{i\alpha f_0}\, \pi_0(W(f))    [7b]


defines a positive-energy state whose GNS
representation for α ≠ 0 is unitarily
inequivalent to the vacuum representation. Its
incorporation into the vacuum Hilbert space [7b] is
part of the DHR formalism. It is convenient to view
this change as the result of an application of an
automorphism γ_α on the C*-Weyl algebra A(S¹)
which is implemented by a unitary charge-generating
operator Γ_α in a larger (nonseparable) Hilbert
space which contains all charge sectors H_α = Γ_α H_0,

    H_0 \equiv H_{vac} = \overline{\mathcal{A}(S^1)\Omega}
    \langle W(f) \rangle_\alpha = \langle \gamma_\alpha(W(f)) \rangle_0
    \gamma_\alpha(W(f)) = \Gamma_\alpha W(f) \Gamma_\alpha^*    [8]

Γ_α Ω = Ω_α describes a state with a rotationally
homogeneous charge distribution; arbitrary charge
distributions ρ_α of total charge α, that is,
∮ (dz/2πi) ρ_α = α, are obtained in the form

    \psi^\zeta_{\rho_\alpha} = \eta(\rho_\alpha)\, W(\hat\rho^\zeta_\alpha)\, \Gamma_\alpha    [9]


where η(ρ_α) is a numerical phase factor and the
net effect of the Weyl operator is to change the
rotationally homogeneous charge distribution into
ρ_α. The necessary charge-neutral compensating
function ρ̂^ζ_α in the Weyl cocycle W(ρ̂^ζ_α) is uniquely
determined in terms of ρ_α up to the choice of one
point ζ ∈ S¹ (the determining equation involves
the ln z function, which needs the specification of
a branch cut (Schroer 2005)). From this formula,
one derives the commutation relations ψ^ζ_{ρ_α} ψ^ζ_{ρ_β} =
e^{±iπαβ} ψ^ζ_{ρ_β} ψ^ζ_{ρ_α} for spacelike separations of the ρ
supports; hence, these fields are relatively local
(bosonic) for αβ ∈ 2Z. In particular, if only one
type of charge is present, the generating charge is
α_gen = √(2N) and the composite charges are multiples,
that is, α_gen Z. This locality condition
providing bosonic commutation relations does
not yet ensure the ζ-independence. Since the
equation which controls the ζ-change turns out
to be

    \psi^{\zeta_1}_{\rho_\alpha} \bigl( \psi^{\zeta_2}_{\rho_\alpha} \bigr)^* \sim e^{2\pi i Q \alpha}    [10]

one achieves ζ-independence by restricting the
Hilbert space charges to be “dual” to that of the
operators, that is,

    Q \in \frac{1}{\sqrt{2N}}\, \mathbb{Z}

The localized ψ^ζ operators acting on the restricted
separable Hilbert space H_res generate a ζ-independent
extended observable algebra A_N(S¹) (Schroer)
and it is not difficult to see that its representation in
H_res is reducible and that it decomposes into 2N
charge sectors

    \Bigl\{ \frac{n}{\sqrt{2N}} \Bigm| n = 0, 1, \ldots, 2N - 1 \Bigr\}

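Hence, the process of extension has led to a charge
quantization with a finite (“rational”) number of
charges relative to the new observable algebra, which
is neutral in the new charge counting:

    \mathbb{Z}\,\frac{1}{\alpha_{gen}} \Big/ \alpha_{gen}\,\mathbb{Z}
      = \mathbb{Z}\,\frac{1}{\sqrt{2N}} \Big/ \sqrt{2N}\,\mathbb{Z}
      = \mathbb{Z}_{2N}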
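
The charge counting above can be illustrated with a few lines of exact arithmetic. This is a minimal sketch, assuming the charges are labeled by integers n standing for n/√(2N) and that the exchange phase between charges α and β is e^{±iπαβ}; the helper names are hypothetical.

```python
from fractions import Fraction


def sector_labels(N):
    """Labels n of the charges n/sqrt(2N), n = 0, ..., 2N - 1."""
    return list(range(2 * N))


def fuse(n1, n2, N):
    """Add two charges modulo the neutral charge sqrt(2N) = 2N/sqrt(2N)."""
    return (n1 + n2) % (2 * N)


def charge_product(n1, n2, N):
    """alpha * beta for alpha = n1/sqrt(2N), beta = n2/sqrt(2N).

    The exchange phase is exp(+/- i*pi*alpha*beta), so an even integer
    means bosonic commutation relations.
    """
    return Fraction(n1 * n2, 2 * N)


if __name__ == "__main__":
    N = 3
    print(sector_labels(N))             # the 2N sector labels
    print(fuse(4, 5, N))                # composition law of Z_{2N}
    print(charge_product(1, 1, N))      # generating charged field
    print(charge_product(2 * N, 2 * N, N))  # old generating field
```

For the old generating charge √(2N) the product αβ is the even integer 2N, whereas for the new generating charge 1/√(2N) it is the fraction 1/(2N), which is where the nontrivial exchange phases come from.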
The charge-carrying fields in the new setting are also
of the above form [9], but now the generating field
carries the charge

    \oint \frac{dz}{2\pi i}\, \rho_{gen} = Q_{gen}

which is a (1/2N) fraction of the old α_gen. Their
commutation relations for disjoint charge supports
are “braidal” (or better “plektonic”, which is more
on par with being bosonic/fermionic). (In the abelian
case like the present, the terminology “anyonic”
enjoys widespread popularity, but in the present
context the “any” does not go well with charge
quantization.) These objects considered as operators
localized on S¹ do depend on the cut ζ, but using an
appropriate finite covering of S¹ this dependence is
removed (Schroer 2005). So the field algebra F_{Z_{2N}}
generated by the charge-carrying fields (as opposed
to the bosonic observable algebra A_N) has its unique
localization structure on a finite covering of S¹. An
equivalent description which gets rid of ζ consists in
dealing with operator-valued sections on S¹. The
extension A → A_N, which renders the Hilbert space
separable and quantizes the charges, seems to be
characteristic for abelian current algebra; in all other
models which have been constructed up to now the
number of sectors is at least denumerable and in the
more interesting ones even finite (rational models).
An extension is called maximal if there exists no
further extension which maintains the bosonic
commutation relation. For the case at hand, a further
extension would require the presence of another generating
field of the same kind as above, which belongs to an
integer N' and is relatively local to the first one. This is
only possible if N is divisible by a square.

In passing, it is interesting to mention a somewhat
unexpected relation between the Schwinger model,
whose charges are screened, and the Jordan model.
Since the Lagrangian formulation of the Schwinger
model is a gauge theory, the analog of the
four-dimensional “asymptotic freedom” wisdom would
suggest the possibility of “charge liberation” in the
short-distance limit of this model. This seems to
contradict the statement that the intrinsic content
of the Schwinger model (QED_2 with massless
fermions), after removing a classical degree of freedom,
is the QFT of a free massive Bose field, and such a
simple free field is at first sight not expected to contain
subtle information about asymptotic charge liberation.
(In its original gauge-theoretical form, the Schwinger
model has an infinite vacuum degeneracy. The
removal of this degeneracy (restoration of the cluster
property) with the help of the “θ-angle formalism”
leaves a massive free Bose field (the Schwinger-Higgs
mechanism). As expected in d = 1 + 1, the model only
possesses this phase.) Well, as we have seen above, the
massless limit really does have liberated charges and
the short-distance limit of the massive free field is the
massless model (Schroer).

As a result of the peculiar bosonization/fermioniza- 
tion aspect of the zero-mass limit of the derivative of 
the massive free field, Jordan’s model is also closely 
related to the massless Thirring model (and the related 
Luttinger model for an interacting one-dimensional 
electron gas) whose massive version is in the class of 
factorizing models (see later section). (Another struc- 
tural consequence of this aspect leads to Coleman’s 
theorem (Schroer 2005) which connects the Mermin- 
Wagner no-go theorem for two-dimensional sponta- 
neous continuous symmetry breaking with these 
zero-mass peculiarities.) The Thirring model is a 
special case in a vast class of “generalized” multi- 
coupling multicomponent Thirring models, that is, 
models with 4-fermion interactions. Under this name 
they were studied in the early 1970s (Schroer) with 
the aim to identify massless subtheories for which the 
currents form chiral current algebras. 
The counterpart of the potential of the conserved
Dirac current in the massive Thirring model is the 
sine-Gordon field, that is, a composite field which in 
the attractive regime of the Thirring coupling again 
obeys the so-called sine-Gordon equation of motion. 
Coleman gave a supportive argument (Schroer 2005) 
but some fine points about the range of its validity in 
terms of the coupling strength remained open. (It was 
noticed that the current potential of the free massive 
Dirac fermion (g = 0) does not obey the sine-Gordon
equation (Schroer 2005).) A rigorous confirmation of 
these facts was recently given in the bootstrap form- 
factor setting (Schroer 2005). Massive models which 
have a continuous or discrete internal symmetry have 
“disorder” fields which implement a “half-space” 
symmetry on the charge-carrying field (acting as the 
identity in the other half-axis) and together with the 
basic pointlike field form composites which have 
exotic commutation relations (see the last section). 


The Conformal Setting, 
Structural Results 


Chiral theories play a special role within the setting 
of conformal quantum fields. General conformal 
theories have observable algebras which live on 
compactified Minkowski space (S! in the case of 
chiral models) and fulfill the Huygens principle, 
which in an even number of spacetime dimensions
means that the commutator is only nonvanishing for 
lightlike separation of the fields. The fact that this 
classically expected behavior breaks down for 
nonobservable conformal fields (e.g., the massless 
Thirring field) was noticed at the beginning of the 
1970s and considered paradoxical at that time 
(“reverberation” in the timelike (Huygens) region). 
Its resolution around 1974-75 confirmed that such 
fields are genuine conformal covariant objects but 
that some fine points about their causality needed to 
be addressed. The upshot was the proposal of two 
different but basically equivalent concepts about 
globally causal fields. They are connected by the 
following global decomposition formula: 


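    A(x_{cov}) = \sum_{\alpha,\beta} A_{\alpha,\beta}(x), \quad x = x(x_{cov})

    A_{\alpha,\beta}(x) = P_\alpha A(x) P_\beta, \quad Z = \sum_\alpha e^{2\pi i \theta_\alpha} P_\alpha    [11]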


On the left-hand side, the spacetime point of the
field is a point on the universal covering of the
conformal compactified Minkowski space. These are
fields (Lüscher and Mack 1975) (Schroer 2005)
which “live” in the sense of quantum (modular)
localization on the universal covering spacetime (or
on a finite covering, depending on the “rationality”
of the model) and fulfill the global causality
condition previously discovered by I. Segal (Schroer
2005). They are generally highly reducible with
respect to the center of the covering group. The
family of fields on the right-hand side, on the other
hand, are fields which were introduced (Schroer and
Swieca 1974; Schroer et al. 1975) with the aim to
have objects which live on the projection x(x_cov),
that is, on the spacetime of the physics laboratory
instead of the “hells and heavens” of the covering
(Schroer 2005). They are operator-distributional
valued sections in the compactification of ordinary
Minkowski spacetime. The connection is given by
the above decomposition formula into irreducible
conformal blocks with respect to the center Z of the
noncompact covering group of SO(2, n), where α, β are
labels for the eigenspaces of the generating unitary Z
of the abelian center Z. The decomposition [11] is
minimal in the sense that there will generally
be a refinement due to the presence of
additional charge superselection rules (and internal
group symmetries). The component fields are not
Wightman fields since they annihilate the vacuum if
the right-hand projection differs from P_0 = P_vac.

Note that the Huygens (timelike) region in Minkowski
spacetime has a timelike ordering structure
x < y or x > y (earlier or later). In d = 1 + 1, the
topology allows in addition a spacelike left-right
ordering x ≶ y. In fact, it is precisely the presence of
these two orderings in conjunction with the factorization
of the vacuum symmetry group SO(2, 2) ≃
PSL(2, R)_l ⊗ PSL(2, R)_r, in particular Z = Z_l ⊗ Z_r,
which is at the root of a significant simplification.
This situation suggested a tensor factorization into
chiral components and led to an extremely rich and
successful construction program of two-dimensional
conformal QFT as a two-step process: the classification
of chiral observable algebras on the light ray and
the amalgamation of left-right chiral theories to
two-dimensional local conformal QFT. The action on the
circular coordinates z is through fractional SU(1, 1)
transformations

    g(z) = \frac{\alpha z + \beta}{\bar\beta z + \bar\alpha}

whereas the covering group acts on the Mack-Luescher
covering coordinates.

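The presence of an ordering structure permits the
appearance of more general commutation relations
for the above A_{α,β} component fields, namely

    A_{\alpha,\beta}(x)\, B_{\beta,\gamma}(y)
      = \sum_{\beta'} R^{(\alpha,\gamma)}_{\beta,\beta'}\, B_{\alpha,\beta'}(y)\, A_{\beta',\gamma}(x),
      \quad x > y    [12]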


with numerical R-coefficients which, as a result of 
associativity and relative commutativity with respect 
to observable fields, have to obey certain structure 
relations; in this way, Artin braid relations emerge 
as a new manifestation of the Einstein causality 
principle for observables in low-dimensional QFT 
(Rehren and Schroer 1989) (see Schroer 2005). 
Indeed, the DHR method to interpret charged fields 
as charge superselection carriers (tied by local 
representation theory to the bosonic local structure 
of observable algebras) leads precisely to such a 
plektonic statistics structure (Fredenhagen et al. 
1992, Gabbiani and Froehlich 1993) for systems in 
low spacetime dimension (see Symmetries in Quan- 
tum Field Theory of Lower Spacetime Dimensions). 
With an appropriately formulated adjustment to 
observables fulfilling the Huygens commutativity, 
this plektonic structure (but now disconnected from 
particle/field statistics) is also a possible manifesta- 
tion of causality for the higher-dimensional timelike 
structure (Schroer 2005). 

The only examples known up to the appearance 
of the seminal BPZ work (Belavin et al. 1984) were 
the abelian current models of the previous section 
which furnish a rather poor man’s illustration of the 
richness of the decomposition theory. The flood- 
gates of conformal QFT were only opened after the 
BPZ discovery of “minimal models,” which was 
preceded by the observation (Friedan et al. 1984) 
that the algebra of the stress-energy tensor came 
with a new representation structure which was not 
compatible with an underlying internal group 
symmetry (see Symmetries in Quantum Field The- 
ory: Algebraic Aspects). 

An important step in the structural study of chiral 
models was the recognition that the energy-momen- 
tum tensor has the commutation structure of a Lie 
field (Schroer 2005); in the next section, its algebraic 
structure and its representation theory will be 
presented. 


Chiral Fields and Two-Dimensional 
Conformal Models 


Let us start with a family which generalizes the
abelian model of the previous section. Instead of a
one-component abelian current we now take n
independent copies. The resulting multicomponent
Weyl algebra has the previous form except that the
current is n-component and the real function space
underlying the Weyl algebra consists of functions
with values in an n-component real vector space,
f ∈ LV, with the standard Euclidean inner product
denoted by ( , ). The local extension now leads to
(α, β) ∈ 2Z, that is, an even-integer lattice L in V,
whereas the restricted Hilbert subspace H_{L*} which
ensures ζ-independence is associated with the dual
lattice L*: (λ_i, α_k) = δ_ik, which contains L. The
resulting superselection structure (i.e., the Q-spectrum)
corresponds to the finite factor group
L*/L. For self-dual lattices L* = L (which can only
occur if dim V is a multiple of 8), the resulting
observable algebra has only the vacuum sector; the
most famous case is the Leech lattice Λ_24 in
dim V = 24, also called the “moonshine” model.
The observation that the root lattices of the Lie
algebras of types A, D, or E (e.g., su(n) corresponding
to A_{n−1}) also appear among the even-integral
lattices suggests that the nonabelian
current algebras associated to those Lie algebras
can also be implemented. This turns out to be
indeed true as far as the level-1 representations are
concerned, which brings us to the second family:
the nonabelian current algebras of level k associated
to those Lie algebras; they are characterized
by the commutation relation

    [J_\alpha(z), J_\beta(z')] = i f_{\alpha\beta}^{\gamma}\, J_\gamma(z)\, \delta(z - z')
      - \frac{1}{2}\, k\, g_{\alpha\beta}\, \delta'(z - z')    [13]


where f^γ_{αβ} are the structure constants of the underlying
Lie algebra, g its Cartan-Killing form, and
k, the level of the algebra, must be an integer in
order that the current algebra can be globalized to a
loop group algebra. The Fourier decomposition of
the current leads to the so-called affine Lie algebras,
a special family of Kac-Moody algebras. For k = 1,
these currents can be constructed as bilinears in
terms of the multicomponent chiral Dirac field;
there exists also the mentioned possibility to obtain
them by constructing their maximal Cartan currents
within the above abelian setting and representing the
remaining nondiagonal currents as certain
charge-carrying (“vertex” algebra) operators. Level-k
algebras can be constructed from reducing tensor
products of k level-1 currents or directly via the
representation theory of infinite-dimensional affine
Lie algebras. (The global exponentiated algebras
(the analogs to the Weyl algebra) are called loop
group algebras.) Either way one finds that, for
example, the SU(2) current algebra of level k has
(together with the vacuum sector) k + 1 sectors
(inequivalent representations). The different sectors
are already distinguished by the structure of their
ground states of the conformal Hamiltonian L_0.
Although the computation of higher point correlation
functions for k > 1 is difficult, there is no problem in
securing the existence of the algebraic nets which
define these chiral models as well as their k + 1
representation sectors and to identify their generating
charge-carrying fields (primary fields) including
their R-matrices appearing in their plektonic
commutation relations. It is customary to use the
notation SU(2)_k for the abstract operator algebras
associated with the current generators [13] and we
will denote their k + 1 equivalence classes of
representations by A_{SU(2)_k, n}, n = 0, ..., k, whereas
representations of current algebras for higher rank
groups require a more complicated labeling (in
terms of Weyl chambers).
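
The statement that the k + 1 sectors are distinguished by their L_0 ground states can be illustrated with the standard textbook values for SU(2)_k, which are not quoted in this article and are assumed here: sectors labeled by spin j = 0, 1/2, ..., k/2 with lowest L_0 eigenvalue h_j = j(j + 1)/(k + 2) and central charge c = 3k/(k + 2). The function names below are illustrative.

```python
from fractions import Fraction


def su2_sectors(k):
    """Spins j = 0, 1/2, ..., k/2 labeling the k + 1 sectors of SU(2)_k."""
    return [Fraction(n, 2) for n in range(k + 1)]


def lowest_weight(j, k):
    """Assumed standard value h_j = j(j + 1)/(k + 2) of the L_0 ground state."""
    return j * (j + 1) / (k + 2)


def central_charge(k):
    """Assumed standard value c = 3k/(k + 2)."""
    return Fraction(3 * k, k + 2)


if __name__ == "__main__":
    for k in (1, 2, 3):
        weights = [lowest_weight(j, k) for j in su2_sectors(k)]
        print(k, central_charge(k), [str(h) for h in weights])
```

Since h_j is strictly increasing in j, the k + 1 ground-state energies are pairwise different, in line with the remark that the sectors are already distinguished at this level.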

The third family of models are the so-called
minimal models, which are associated with the
Lie-field commutation structure of the chiral
stress-energy tensor which results from the chiral
decomposition of a conformally covariant
two-dimensional stress-energy tensor

    [T(z), T(z')] = i \bigl( T(z) + T(z') \bigr)\, \delta'(z - z')
      - \frac{ic}{24\pi}\, \delta'''(z - z')    [14]

whose Fourier decomposition yields the Witt-Virasoro
algebra, that is, a central extension of
the Lie algebra of Diff(S¹). (The presence of
the central term in the context of QFT (the analog
of the Schwinger term) was noticed later; however,
the terminology Witt-Virasoro algebra in the
physics literature came to mean the Lie algebra
of diffeomorphisms of the circle including the
central extension.) The first two coefficients are
determined by the physical role of T(z) as the
generating field density for the Lie algebra of the
Poincaré group and of the Moebius transformations,
whereas the undetermined parameter c > 0 (the
central extension parameter, positive by positivity
of the two-point function) is easily identified with
the strength of the two-point function. Although
the structure of the T-correlation functions resembles
that of free fields (in the sense that there is an
algebraically computable unique set of correlation
functions once one has specified the two-point
function), the realization that c is subject to a
discrete quantization if c < 1 came as a surprise.
As already mentioned, the observation that the
superselection sectors (the positive-energy
representation structure) of this algebra did not at
all follow the logic of a representation theory of an
inner symmetry group generated a lot of attention
and stimulated a flurry of publications on symmetry
concepts beyond groups (quantum groups). A concept
of fundamental importance is the DHR theory of
localized endomorphisms of operator algebras and
the concept of operator-algebraic inclusions (in
particular, inclusions with conditional expectations,
the V. Jones inclusions).
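
The discrete quantization of c below 1 can be made concrete. The sketch below assumes the standard parametrization of the unitary minimal series, c = 1 − 6/(m(m + 1)) with m = 3, 4, ..., and the Kac formula for the lowest L_0 eigenvalues of the sectors; neither formula is written out in this article, and the function names are illustrative.

```python
from fractions import Fraction


def minimal_central_charge(m):
    """Assumed standard value c = 1 - 6/(m(m + 1)), m = 3, 4, 5, ..."""
    if m < 3:
        raise ValueError("the unitary minimal series starts at m = 3")
    return 1 - Fraction(6, m * (m + 1))


def kac_weights(m):
    """Distinct values h_{r,s} = ((r(m+1) - s m)^2 - 1) / (4 m (m+1)),
    for 1 <= r <= m - 1 and 1 <= s <= m (assumed Kac formula)."""
    return {
        Fraction((r * (m + 1) - s * m) ** 2 - 1, 4 * m * (m + 1))
        for r in range(1, m)
        for s in range(1, m + 1)
    }


if __name__ == "__main__":
    for m in (3, 4, 5):
        weights = sorted(kac_weights(m))
        print(m, minimal_central_charge(m), [str(h) for h in weights])
```

For m = 3 this gives c = 1/2 with the three sectors of the Ising field theory; in general the number of distinct sectors in this sketch is m(m − 1)/2, so each member of the series is rational.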

The SU(2)_k current coset construction (Goddard
et al. 1985) revealed that the proof of existence and
the actual construction of the minimal models is
related to that of the SU(2)_k current algebras.
Constructing a chiral model does not necessarily
mean the explicit determination of the n-point
Wightman functions of their generating fields
(which for most chiral models remains a prohibitively
complicated task) but rather a proof of their
existence by demonstrating that these models are
obtained from free fields by a series of computationally
complicated but mathematically controlled
operator-algebraic steps such as reduction of tensor
products, formation of orbifolds under group
actions, coset constructions, and a special kind of
extensions. The generating fields of the models are
nontrivial in the sense of not obeying free-field
equations (i.e., not being “on-shell”). The cases
where one can write down explicit n-point functions
of generating fields are very rare; in the case of the
minimal family this is limited to the field theory of
the Ising model (Schroer 2005).
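
One quick consistency check of the coset relation is at the level of central charges. The sketch below assumes the standard values c(SU(2)_k) = 3k/(k + 2), the coset SU(2)_k ⊗ SU(2)_1 / SU(2)_{k+1}, and the additivity of central charges under this construction; these details are not spelled out in the text, and the function names are illustrative.

```python
from fractions import Fraction


def c_su2(k):
    """Assumed central charge 3k/(k + 2) of the SU(2) current algebra at level k."""
    return Fraction(3 * k, k + 2)


def c_coset(k):
    """Central charge of the coset SU(2)_k x SU(2)_1 / SU(2)_{k+1}."""
    return c_su2(k) + c_su2(1) - c_su2(k + 1)


def c_minimal(m):
    """Assumed minimal-series value 1 - 6/(m(m + 1))."""
    return 1 - Fraction(6, m * (m + 1))


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, c_coset(k), c_minimal(k + 2))
```

Under these assumptions the coset at level k reproduces the minimal-series value with m = k + 2, starting with c = 1/2 at k = 1.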

To show the power of inclusion theory for the
determination of the charge content of a theory, let us
look at a simple illustration in the context of the above
multicomponent abelian current algebra. The vacuum
representation of the corresponding Weyl algebra is
generated from smooth V-valued functions on the
circle modulo constant functions (i.e., functions with
vanishing total integral) f ∈ LV_0. These functions,
equipped with the aforementioned complex structure
and scalar product, yield a Hilbert space. The
I-localized subalgebra is generated by the Weyl image
of I-supported functions (class functions whose
representing functions are constant in the complement I′)

    \mathcal{A}(I) := \mathrm{alg}\{ W(f) \mid f \in K(I) \}
    K(I) = \{ f \in LV_0 \mid f = \mathrm{const.\ in}\ I' \}    [15]

The one-interval Haag duality A(I)′ = A(I′) (the
commutant algebra equals the algebra localized in
the complement) is simply a consequence of the fact
that the symplectic complement K(I)′ in terms of
Im(f, g) consists of real functions in that space which
are localized in the complement, that is,
K(I)′ = K(I′). The answer to the same question for
a double interval I = I_1 ∪ I_3 (think of the first and
third quadrant on the circle) does not lead to duality
but rather to a genuine inclusion

    K((I_1 \cup I_3)') = K(I_2 \cup I_4) \subset K(I_1 \cup I_3)'
    \text{or, equivalently,} \quad K(I_1 \cup I_3) \subset K((I_1 \cup I_3)')'    [16]

The meaning of the left-hand side is clear; these
are functions which are constant in I_1 ∪ I_3 with the
same constant in the two intervals, whereas the
functions on the right-hand side are less restrictive
in that the constants can be different. The
conversion of real subspaces into von Neumann
algebras by the Weyl functor leads to the algebraic
inclusion A(I_1 ∪ I_3) ⊂ A((I_1 ∪ I_3)′)′. In physical
terms, the enlargement results from the fact that
within the charge-neutral vacuum algebra a charge
split occurs, with one charge in I_1 and the compensating
charge in I_3, for all values of the (unquantized)
charge. A more realistic picture is obtained
if one allows a charge split to be subjected to a
charge quantization implemented by a lattice
condition f(I_2) − f(I_4) ∈ 2πL which relates the
two multicomponent constant functions (where
f(I) denotes the constant value f takes in I). As
in the previous one-component case, the choice of
even lattices corresponds to the local (bosonic)
extensions. Although imposing such a lattice
structure destroys the linearity of the K, the
functions still define Weyl operators which generate
operator algebras A_L(I_1 ∪ I_3). (The linearity
structure is recovered on the level of the operator
algebra.) But now the inclusion involves the dual
lattice L* (which of course contains the original
lattice),

    \mathcal{A}_L(I_1 \cup I_3) \subset \mathcal{A}_{L^*}(I_1 \cup I_3)
    \mathrm{ind}\{ \mathcal{A}_L(I_1 \cup I_3) \subset \mathcal{A}_L((I_1 \cup I_3)')' \} = |G|
    \mathcal{A}_L(I_1 \cup I_3) = \mathrm{inv}_G\, \mathcal{A}_{L^*}(I_1 \cup I_3)

This time the possible charge splits correspond to
the factor group G = L*/L, that is, the number of
possibilities is |G|, which measures the relative size
of the bigger algebra in terms of the smaller. This is
a special case of the general concept of the so-called
Jones index of an inclusion, which is a numerical
measure of its depth. A prerequisite is that the
inclusion permits a conditional expectation, which
is a generalization of the averaging under the
“gauge group” G on A_{L*}(I_1 ∪ I_3) in the third
equation above, which identifies the invariant
smaller algebra with the fix-point algebra (the
invariant part) under the action of G. In fact,
using the conceptual framework of Jones, one can
show that the two-interval inclusion is independent
of the position of the disjoint intervals and is
characterized by the group G.
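
The order |G| of the factor group L*/L is easy to compute in examples. The sketch below relies on the standard lattice fact (not stated in the text) that |L*/L| equals the determinant of the Gram matrix of any basis of L; the A_n and E_8 Gram matrices are the usual Cartan matrices, and all function names are illustrative.

```python
from fractions import Fraction


def gram_from_edges(n, edges):
    """Gram matrix with 2 on the diagonal and -1 for each Dynkin-diagram edge."""
    g = [[2 if i == j else 0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        g[i][j] = g[j][i] = -1
    return g


def gram_A(n):
    """Root lattice A_n: a chain of n nodes."""
    return gram_from_edges(n, [(i, i + 1) for i in range(n - 1)])


def gram_E8():
    """Root lattice E_8: chain 0-1-2-3-4-5-6 with node 7 attached to node 2."""
    return gram_from_edges(8, [(i, i + 1) for i in range(6)] + [(2, 7)])


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


if __name__ == "__main__":
    print(determinant([[6]]))        # one-component lattice sqrt(2N) Z with N = 3
    print(determinant(gram_A(4)))    # su(5) root lattice
    print(determinant(gram_E8()))    # self-dual case
```

In this sketch the one-component lattice √(2N)Z has |G| = 2N, matching the 2N sectors found earlier, and the E_8 root lattice has |G| = 1, the self-dual situation with only the vacuum sector.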

There exists another form of this inclusion which
is more suitable for generalizations. One starts from
the charge-quantized extended local algebra A_L^ext ⊃ A
described earlier in terms of an even-integer lattice
L (which lives in the separable Hilbert space H_{L*}) as
our observable algebra. Again the Haag duality is
violated and converted into an inclusion A_L^ext(I_1 ∪ I_3)
⊂ A_L^ext((I_1 ∪ I_3)′)′ which turns out to have the same
G = L*/L charge structure (it is in fact isomorphic
to the previous inclusion). In the general setting
(current algebras, minimal model algebras, ...), this
double interval inclusion is particularly interesting if
the associated Jones index is finite. One finds
(Kawahigashi et al. 2001, Schroer 2005):


Theorem 1 A chiral theory with finite Jones index
μ = ind{A((I_1 ∪ I_3)′)′ : A(I_1 ∪ I_3)} for the double
interval inclusion (always assuming that A(S¹) is
strongly additive and split) is a rational theory and
the statistical dimensions d_ρ of its charge sectors are
related to μ through the formula

    \mu = \sum_\rho d_\rho^2    [17]
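
Formula [17] can be evaluated in simple cases. The sketch below assumes the standard statistical dimensions of the SU(2)_k sectors, d_n = sin((n + 1)π/(k + 2)) / sin(π/(k + 2)) for n = 0, ..., k, which are not given in this article; for the abelian lattice models every sector has d = 1, so the sum reduces to the number of sectors |G|. Function names are illustrative.

```python
import math


def su2_statistical_dimensions(k):
    """Assumed standard values d_n = sin((n+1) pi/(k+2)) / sin(pi/(k+2))."""
    q = math.pi / (k + 2)
    return [math.sin((n + 1) * q) / math.sin(q) for n in range(k + 1)]


def mu_index(dimensions):
    """Sum of squared statistical dimensions, as in formula [17]."""
    return sum(d * d for d in dimensions)


if __name__ == "__main__":
    for k in (1, 2, 3):
        dims = su2_statistical_dimensions(k)
        print(k, [round(d, 6) for d in dims], round(mu_index(dims), 6))
    # Abelian case with 2N = 6 sectors, all of statistical dimension 1.
    print(mu_index([1] * 6))
```

With these assumed dimensions the sum has the closed form (k + 2) / (2 sin²(π/(k + 2))), which is an integer only in special cases such as k = 1 and k = 2; noninteger values reflect the genuinely plektonic sectors.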


Instead of presenting more constructed chiral 
models, it may be more informative to mention 
some of the algebraic methods by which they are 
constructed and explored. The already mentioned 
DHR theory provides the conceptual basis for 
converting the notion of positive-energy represen- 
tation sectors of the chiral model observable 
algebras A (equivalence classes of unitary repre- 
sentations) into localized endomorphisms p of this 
algebra. This is an important step because con- 
trary to group representations which have a 
natural tensor product composition structure, 
representations of operator algebras generally do 
not come with a natural composition structure. 
The DHR endomorphisms theory of A leads to 
fusion laws and an intrinsic notion of generalized 
statistics (for chiral theories: plektonic in addition 
to bosonic/fermionic). The chiral statistics para- 
meters are complex numbers (Haag 1992) whose 
phase is related to a generalized concept of spin 
via a spin-statistics theorem and whose absolute 
value (the statistics dimension) generalizes the
notion of multiplicities of fields known from the 
description of inner symmetries in higher-dimen- 
sional standard QFTs. The different sectors may 
be united into one bigger algebra called the 
exchange algebra F_red in the chiral context (the
“reduced field bundle” of DHR) in which every 
sector occurs by definition with multiplicity 1 and 
the statistics data are encoded into exchange 
(commutation) relations of charge-carrying opera- 
tors or generating fields (“exchange algebra 
fields”) (Schroer 2005). Even though this algebra 
is useful in that all properties concerning fusion 
and statistics are nicely encoded, it lacks some 
cherished properties of standard field theory;
namely, there is no unique state-field relation, that
is, no Reeh-Schlieder property (a field $A_{\alpha\beta}$ whose
source projection $P_\alpha$ does not coalesce with the
vacuum projection annihilates the vacuum); in
operator-algebraic terms, the local algebras are 
not factors. This poses the question of how to 
manufacture from the set of all sectors natural 
(not necessarily local) extensions with these 
desired properties. It was found that this problem 
can be characterized in operator-algebraic terms 
by the existence of the so-called DHR triples 
(Schroer). In case of rational theories, the number 
of such extensions is finite and in the aforemen- 
tioned “classical” current algebra and minimal 
models they all have been constructed by this 
method (thus confirming existing results complet- 
ing the minimal family by adding some missing 
models). The same method adapted to the chiral 
tensor product structure of $d = 1+1$ conformal
observables classifies and constructs all two-dimensional
local (bosonic/fermionic) conformal QFTs $\mathcal{B}_2$
which can be associated with the observable chiral 
input. It turns out that this approach leads to 
another of those pivotal numerical matrices which 
encode structural properties of QFT: the coupling 
matrix Z, 


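$$\mathcal{A} \otimes \mathcal{A} \subset \mathcal{B}_2$$
$$\sum_{\rho\sigma} Z_{\rho\sigma}\, \rho(\mathcal{A}) \otimes \sigma(\mathcal{A}) \subset \mathcal{A} \otimes \mathcal{A} \qquad [18]$$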


where the second line is an inclusion solely 
expressed in terms of observable algebras from 
which the desired (isomorphic) inclusion in the first 
line follows by a canonical construction, the so- 
called Jones basic construction. The numerical 
matrix Z is an invariant closely related to the so- 
called “statistics character matrix” (Schroer 2005) 
and in case of rational models it is even a modular 
invariant with respect to the modular SL(2, Z) group 
transformations (which are closely related to the 
matrix $S$ in the final section).


Integrability, the Bootstrap 
Form-Factor Program 


Integrability in QFT and the closely associated 
bootstrap form-factor construction of a very rich 
class of massive two-dimensional QFTs can be 
traced back to two observations made during the 
1960s and 1970s. On the one hand, there was
the time-honored idea to bypass the “off-shell” field- 
theoretic approach to particle physics in favor of a 
pure on-shell S-matrix setting (recommended in particular
for strong interactions) which, as a result of


the elimination of short distances via the mass-shell 
restriction, would be free of ultraviolet divergencies. 
This idea was enriched in the 1960s by the crossing 
property which in turn led to the bootstrap idea, a 
highly nonlinear seemingly self-consistent proposal 
for the determination of the S-matrix. However, the 
protagonists of this S-matrix bootstrap program 
placed themselves into a totally antagonistic fruitless 
position with respect to QFT so that the strong 
return of QFT in the form of gauge theory under- 
mined their credibility. On the other hand, there 
were rather convincing quasiclassical calculations in 
certain two-dimensional massive QFTs as, for 
example, the sine-Gordon model which indicated 
that the obtained quasiclassical mass spectrum is 
exact and hence suggested that the associated 
QFTs are integrable (Dashen et al. 1975) and 
have no real particle creation. These provocative 
observations asked for a structural explanation 
beyond quasiclassical approximations, and it soon 
became clear that the natural setting for obtain- 
ing such mass formulas was that of the “fusion” 
of boundstate poles of unitary crossing-symmetric 
purely elastic S-matrices; first in the special 
context of the sine-Gordon model (Schroer et al. 
1976) and later as a classification program from 
which factorizing S-matrices can be determined 
by solving well-defined equations for the elastic 
two-particle S-matrix (Karowski et al. 1977). 
(It was incorrectly believed that the “nontrivial 
elastic scattering implies particle creation” 
statement of Aks (Aks, 1963) is also valid for 
low-dimensional QFTs.) Some equations in this 
bootstrap approach resembled mathematical 
structures which appeared in C N Yang’s work 
on nonrelativistic δ-function particle interactions
as well as relations for Boltzmann weights in 
Baxter’s work on solvable lattice models; hence, 
they were referred to as Yang-Baxter relations. 
These results suggested that the old bootstrap 
idea, once liberated from its ideological dead 
freight (in particular from the claim that the 
bootstrap leads to a unique “theory of 
everything" (minus gravity)), generates a useful 
setting for the classification and construction 
of factorizing two-dimensional relativistic
S-matrices. Adapting certain known relations
between two-particle form factors of field opera- 
tors and the S-matrix to the case at hand 
(Karowski and Weisz 1978), and extending this 
with hindsight to generalized (multiparticle) form 
factors, one arrived at the axiomatized recipes of 
the bootstrap form-factor program of d=1+1 
factorizable models (Smirnov 1992). Although 
this approach can be formulated within the 


setting of the LSZ scattering formalism, the use of 
a certain algebraic structure (Zamolodchikov and 
Zamolodchikov 1979) which in the simplest 
version reads 


$$Z(\theta) Z^*(\theta') = S^{(2)}(\theta - \theta')\, Z^*(\theta') Z(\theta) + \delta(\theta - \theta')$$
$$Z(\theta) Z(\theta') = S^{(2)}(\theta' - \theta)\, Z(\theta') Z(\theta) \qquad [19]$$


(the $\delta$-term is due to Faddeev) brought
significant simplifications. In the general case, the
$Z$'s are vector valued and the $S^{(2)}$-structure function
is matrix valued. (The identification of the Z-F
structure coefficients with the elastic two-particle
S-matrix $S^{(2)}$ (which is pre-empted by our notation)
can be shown to follow from the physical interpretation
of the Z-F structure in terms of localization.)
In that case the associativity of the Z-F
algebra is equivalent to the Yang-Baxter equations.
Recently, it became clear that this algebraic relation 
has a deep physical interpretation; it is the simplest 
algebraic structure which can be associated with 
generators of nontrivial wedge-localized operator 
algebras (see the next section). 

Conceptually as well as computationally it is much 
simpler to identify the intrinsic meaning of integr- 
ability in QFT with the factorization of its S-matrix 
or a certain property of wedge-localized algebras 
(see next section) than to establish integrability (see 
Integrability and Quantum Field Theory). 

The first step of the bootstrap form-factor 
program namely the classification and construction 
of model S-matrices follows a combination of two 
patterns: prescribing particle multiplets transforming 
according to group symmetries and/or specifying 
structural properties of the particle spectrum. The 
simplest illustration for the latter strategy is supplied 
by the $Z_N$ model. In terms of particle content, $Z_N$
demands the identification of the $N$th bound state
with the antiparticle. The fusion condition for
the bound-state mass, $m_b^2 = (p_1 + p_2)^2 = m_1^2 + m_2^2 + 2 m_1 m_2\, \mathrm{ch}(\theta_1 - \theta_2)$,
can only be satisfied for a purely imaginary
rapidity difference $\theta_{12} = \theta_1 - \theta_2 = 2i\alpha$ (with $\alpha$ the “binding
angle”). Hence, the binding of two “elementary”
particles of mass $m$ gives


$$m_2 = m\, \frac{\sin 2\alpha}{\sin \alpha}$$


and more generally of $k$ particles


$$m_k = m\, \frac{\sin k\alpha}{\sin \alpha}$$

so that the antiparticle mass condition $m_{N-1} = \bar{m} = m$
fixes the binding angle to $\alpha = \pi/N$. (The quotation
marks are meant to indicate that in contrast to
Schrödinger QM there is “nuclear democracy” on
the level of particles. The inexorable presence of
interaction-caused vacuum polarization limits a
fundamental/fused hierarchy to the fusion of
charges.) The minimal (no additional physical
poles) two-particle S-matrix in terms of which the
$n$-particle S-matrix factorizes is therefore


$$S^{(2)}_{\min}(\theta) = \frac{\sinh \frac{1}{2}(\theta + 2\pi i/N)}{\sinh \frac{1}{2}(\theta - 2\pi i/N)} \qquad [20]$$


(minimal = without so-called CDD poles). The
SU($N$) model as compared with the U($N$) model
requires a similar identification of bound states of
$N - 1$ particles with an antiparticle. This S-matrix
enters in the equation for the vacuum to
$n$-particle meromorphic form factor of local operators;
together with the crossing and the so-called
“kinematical pole equation,” one obtains a recursive 
infinite system linking a certain residue with a form 
factor involving a lower number of particles. The 
solutions of this infinite system form a linear space 
from which the form factors of specific tensor fields 
can be selected by a process which is analogous but 
more involved than the specification of a Wick basis 
of composite free fields. Although the statistics 
property of two-dimensional massive fields is not 
intrinsic but a matter of choice, it would be natural 
to realize, for example, the $Z_N$ fields as $Z_N$-anyons.
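The basic properties of the minimal two-particle S-matrix can be checked numerically. The sketch below is a minimal illustration, assuming the hyperbolic-sine form of the $Z_N$ S-matrix with its pole at $\theta = 2\pi i/N$; the function name `s_min` is hypothetical. It checks that the S-matrix is a pure phase for real rapidity, that $S(\theta)S(-\theta) = 1$, and that $N = 2$ gives the constant $-1$.

```python
import cmath
import math

def s_min(theta, n):
    """Minimal Z_N two-particle S-matrix, assumed here in hyperbolic-sine form."""
    shift = 1j * math.pi / n  # half of the pole position 2*pi*i/N
    return cmath.sinh(0.5 * theta + shift) / cmath.sinh(0.5 * theta - shift)

if __name__ == "__main__":
    theta = 0.7
    print(abs(s_min(theta, 5)))                 # modulus for real rapidity
    print(s_min(theta, 5) * s_min(-theta, 5))   # unitarity product
    print(s_min(theta, 2))                      # Z_2 case
```

The modulus and the unitarity product depend only on the ratio structure of the formula, so the same checks apply to any $N \geq 2$.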

Another rich class of factorizing models are 
the Toda theories of which the sine-Gordon and 
sinh-Gordon are the simplest cases. For their 
descriptions, the quasiclassical use of Lagrangians 
(supported by integrability) turns out to be of some 
help in setting up their more involved bootstrap 
form-factor construction. 

The unexpected appearance of objects with new 
fundamental (solitonic) charges (e.g., the Thirring 
field as the carrier of a solitonic sine-Gordon charge) 
and the unexpected confinement of charges (e.g., the 
CP(1) model as a confined SU(2) model) turn out to 
be opposite sides of the same coin and both cases 
have realizations in the setting of factorizing models
(Schroer 2005).


Recent Developments 


There are two ongoing developments which place 
the two-dimensional bootstrap form-factor program 
into a more general setting which permits one to understand
its position in the general context of local
quantum physics. 

One of these starts from the observation that the 
smallest spacetime localization region in which it is 
possible to find vacuum-polarization-free generators 
(PFG) in the presence of interactions is the wedge 
region. If one demands in addition that these
generators (necessarily unbounded operators) have 
the standard domain properties of QFT (which 
include stability of the domain under translations), 
then one finds that this leads precisely to the two-dimensional
Z-F algebraic structure, which in this way
acquires a spacetime interpretation for the first
time. In these investigations (Schroer 2005),
modular localization theory plays a prominent role 
and there are strong indications that with these 
methods one can show the nontriviality of intersec- 
tions of wedge algebras which is the algebraic 
criterion for the existence of a model within local 
quantum physics. 

There is a second constructive idea based on light- 
front holography which uses the radical reorganiza- 
tion of spacetime properties of the algebraic structure 
while maintaining the physical content including the 
Hilbert space. Since spacetime localization aspects 
(apart from the remark about wedge algebras and 
their PFG generators made before) are traditionally 
related to the concept of fields, holographic methods 
tend to de-emphasize the particle structure in favor of 
“field properties." Indeed, the transversely extended 
chiral theories which arise as the holographic image 
lead to simplification of many interesting properties 
with very similar aims to the old “light-cone 
quantization" except that light-front holography is 
another way of looking at the original local ambient 
theory without subjecting it to another quantization. 
(The price for this simplification is that as a result of 
the nonuniqueness of the holographic inversion 
certain problems cannot be formulated.) 

Actually, as a result of the absence of a transverse 
direction in the two-dimensional setting, the family 
of factorizing models provides an excellent theore- 
tical laboratory to study their rigorous “chiral 
encoding” which is conceptually very different 
from Zamolodchikov’s perturbative relation (which 
is based on identifying a factorizing model in terms 
of a perturbation on a chiral theory). 

It turns out that the issue of statistics of particles 
loses its physical relevance for two-dimensional 
massive models since they can be changed without 
affecting the physical content. Instead such notions 
as order/disorder fields and soliton take their place 
(Schroer 2005). 

In accordance with its historical origin, the theory 
of two-dimensional factorizing models may also be 
viewed as an outgrowth of the quantization of 
classical integrable systems (Integrability and Quan- 
tum Field Theory). But in comparison with the 
rather involved structure of integrability (verifying
the existence of sufficiently many commuting con- 
servation laws), the conceptual setting of factorizing 


models within the scattering framework (factoriza- 
tion follows from existence of wedge-localized 
tempered PFGs) is rather simple and intrinsic 
(Schroer 2005). 

Among the additional ongoing investigations 
in which the conceptual relation with higher- 
dimensional QFT is achieved via modular localiza- 
tion theory, we will select three which have caught 
our active attention. One is motivated by the recent
discovery of the adaptation of Einstein's classical
principle of local covariance to QFT in curved 
spacetime. The central question raised by this work 
(see Algebraic Approach to Quantum Field Theory) 
is if all models of Minkowski spacetime QFTs 
permit a local covariant extension to curved space- 
time and if not which models do? In the realm of 
chiral QFT, this would amount to asking if all
Moebius-invariant models are also Diff($S^1$)-covariant.
It has been known for some time that a QFT
with all its rich physical content can be uniquely 
defined in terms of a carefully chosen relative 
position of a finite number of copies of one unique 
von Neumann operator algebra within one common 
Hilbert space. This is a perfect quantum field-theoretical
illustration for Leibniz's philosophical
proposal that reality results from the relative
position of “monades” (as opposed to the more
common (Newtonian) view that the material reality
originates from a material content being placed into
a spacetime vessel) if one takes the step of identifying
the hyperfinite type III$_1$ Murray-von Neumann
factor algebra with an abstract monade from which
the different copies result from different ways of
positioning in a shared Hilbert space (Schroer 2005).
In particular, Moebius-covariant chiral QFTs arise 
from two monades with a joint intersection defining 
a third monade in such a way that the relative 
positions are specified in terms of natural modular 
concepts (without reference to geometry). This begs 
the question whether one can extend these modular- 
based algebraic ideas to pass from the global 
vacuum preserving Moebius invariance to local 
Diff($S^1$) covariance, Moeb $\to$ Diff($S^1$). This would
be precisely the two-dimensional adaptation of the 
crucial problem raised by the recent successful 
generalization of the local covariance principle 
underlying Einstein's classical theory of gravity to 
QFT in curved spacetime: does every Poincaré 
covariant Minkowski spacetime QFT allow a unique 
correspondence with one curved spacetime (having 
the same abstract algebraic substrate but with a 
totally different spacetime encoding)? In the chiral 
context, one is led to the notion of “partially 
geometric modular groups” which only act geometrically
if restricted to specific subalgebras (Schroer
2005). It is hard to imagine how one can combine
quantum theory and gravity without understanding 
first the still mysterious links between spacetime 
geometry, thermal properties, and relative position- 
ing of monades in a joint Hilbert space. 

A second important umbilical cord with higher- 
dimensional theories is the issue of “Euclideaniza- 
tion” in particular the chiral counterpart of 
Osterwalder-Schrader localization and the closely 
related Nelson-Symanzik duality. In concrete chiral 
models (e.g., the models in the section “Chiral fields 
and two-dimensional conformal models"), it has 
been noted as a result of explicit calculations that 
the analytic continuation in the angular parametri- 
zation for thermal correlation functions leads to 
a duality relation in 


$$\langle A(\varphi_1, \ldots, \varphi_n) \rangle \ldots$$
where the thermal correlation function is defined as
Compared with the thermally extended Nelson-
Symanzik relation for two-dimensional QFT one 
notices that in addition to the expected behavior of 
real coordinates becoming imaginary and the 
$2\pi$-periodicity changing role with the (suitably
normalized) KMS inverse temperature, there is a 
rotation in the space of superselected charges in 
terms of a unitary matrix $S$ whose origin lies in the
braid group statistics (the statistics character 
matrix). The deeper structural explanation which 
shows that this relation is not just a property of 
special models, but rather a generic property of 
chiral QFT, comes from a very deep angular 
Euclideanization which is based on modular theory 
(Schroer). Specializing A = identity, one obtains a 
relation for the partition function, the famous 
Verlinde identity which is part of the transformation 
law of the thermal angular correlation functions 
under the SL(2, Z) modular group.

There are many additional important observations 
on factorizing models whose relation to the physical 
principles of QFT, unlike the bootstrap form-factor 
program, is not yet settled. The meaning of the 
c-parameter outside the chiral setting and ideas on 
its renormalization group flow as well as the various 
formulations of the thermodynamic Bethe ansatz 
belong to a series of interesting observations whose
final relation to the principles of QFT still needs 
clarification. 


See also: Algebraic Approach to Quantum Field Theory; 
Axiomatic Quantum Field Theory; Bosons and Fermions 
in External Fields; Euclidean Field Theory; Integrability
and Quantum Field Theory; Operator Product Expansion 
in Quantum Field Theory; Sine-Gordon Equation; 
Symmetries in Quantum Field Theory: Algebraic Aspects; 
Symmetries in Quantum Field Theory of Lower 
Spacetime Dimensions; Tomita—Takesaki Modular 
Theory. 
Introduction


The discovery of the universality phenomenon and the
underlying renormalization mechanism by Feigenbaum
and independently by Coullet and Tresser in
the late 1970s was one of the most influential events
in dynamical systems theory in the last quarter
of the twentieth century. It was numerically
observed that the cascades of doubling bifurcations
leading to chaotic regimes in one-parameter
families of interval maps, as well as the dynamical
attractors that appear in the limits, exhibit
universal small-scale geometry. To explain this
surprising observation, a “Renormalization Conjecture”
was formulated which asserted that a
natural renormalization operator acting in the 
space of dynamical systems has a unique hyper- 
bolic fixed point. 

It took about two decades to prove this conjecture 
rigorously (and without the help of computers). The 
proof revealed rich mathematical structures behind 
the universality phenomenon that linked it tightly to 
holomorphic dynamics and conformal and hyper- 
bolic geometry. 

Besides the universality per se, the renormaliza- 
tion theory led to many other important results. 
These include the proof of the regular-or-stochastic
dichotomy, which gives us a complete understanding
of the real quadratic family (and more
general families of one-dimensional maps) from
the measure-theoretic point of view, as well as deep
advances in several key problems of holomorphic 
dynamics. 

Since the original discovery, many other manifes- 
tations of the universality have been observed, 
experimentally, numerically, and theoretically, in 
various classes of dynamical systems. However, in 
this article we will focus on mathematical aspects of 
the original phenomenon. 


General Terminology and Notations 


We will use general notations and terminology from 
Holomorphic Dynamics. 


Unimodal Maps 
Definitions and Conventions 


Let us consider a smooth interval map $f : I \to I$. It is
called unimodal if it has a single critical point $c$ and
this point is an extremum. We assume that the critical
point is nondegenerate, unless explicitly stated
otherwise. A unimodal map is called S-unimodal if it
has a negative Schwarzian derivative:


$$Sf = \frac{f'''}{f'} - \frac{3}{2} \left( \frac{f''}{f'} \right)^2 < 0$$

For simplicity, we also assume that the map $f$ is
even, and normalize it so that $c = 0$ and one of the
endpoints of $I$ is a fixed point.
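The negativity of the Schwarzian derivative is easy to check for concrete families. The sketch below is only an illustration: it evaluates $Sf = f'''/f' - \tfrac{3}{2}(f''/f')^2$ from hand-computed derivatives of $x \mapsto x^2 + c$ and $x \mapsto a \sin x$, and the function names are hypothetical.

```python
import math

def schwarzian(d1, d2, d3):
    """Schwarzian derivative from the values f'(x), f''(x), f'''(x)."""
    return d3 / d1 - 1.5 * (d2 / d1) ** 2

def schwarzian_quadratic(x):
    # f(x) = x^2 + c: f' = 2x, f'' = 2, f''' = 0 (undefined at the critical point x = 0)
    return schwarzian(2.0 * x, 2.0, 0.0)

def schwarzian_sine(a, x):
    # f(x) = a sin x: f' = a cos x, f'' = -a sin x, f''' = -a cos x
    return schwarzian(a * math.cos(x), -a * math.sin(x), -a * math.cos(x))

if __name__ == "__main__":
    print(schwarzian_quadratic(0.5))
    print(schwarzian_sine(2.0, 0.3))
```

For the quadratic map the result reduces to $-3/(2x^2)$ and for the sine family to $-1 - \tfrac{3}{2}\tan^2 x$, both negative away from the critical point, independently of $c$ and $a$.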


Topological Dynamics 


Let $J \ni 0$ be a 0-symmetric periodic interval, that is,
$f^p(J) \subset J$ for some $p \in \mathbb{N}$, such that the intervals
$J_k = f^k(J)$, $k = 0, 1, \ldots, p - 1$, have disjoint interiors.
Then we refer to $\bigcup_k J_k$ as a cycle of intervals of period $p$.

According to their topological dynamics, S-unimodal
maps can be divided into three possible
types (Sharkovskii, Singer, Guckenheimer, Misiurewicz,
van Strien, Blokh, etc.):


• Regular maps. Such a map has an attracting or
parabolic cycle $\alpha$. In this case, almost all trajectories
of $f$ converge to $\alpha$. In case $\alpha$ is attracting, the
map $f$ is also called hyperbolic (see Holomorphic
Dynamics).

• Topologically chaotic maps. For such a map,
there is a cycle of intervals $\bigcup_k J_k$ such that the
restriction $f \,|\, \bigcup_k J_k$ is topologically transitive (i.e., it
has a dense orbit). Moreover, for almost all $z \in I$,
orb $z$ eventually lands in this cycle.

• Infinitely renormalizable maps. For such a map,
there is a nested sequence of periodic intervals
$J^0 \supset J^1 \supset \cdots \ni 0$ of periods $p_n \to \infty$. Then the
intersection of the corresponding cycles of
intervals,


$$A = A_f = \bigcap_{n=0}^{\infty} \bigcup_{k=0}^{p_n - 1} J^n_k \qquad [1]$$

is a Cantor set endowed with a natural group
structure (the inverse limit of the cyclic groups $\mathbb{Z}/p_n\mathbb{Z}$)
such that $f \,|\, A$ becomes a group translation.
Moreover, $f^n z \to A$ for a.e. $z \in I$. This Cantor set
is also called the Feigenbaum attractor of $f$.
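The group translation on the attractor can be pictured as an "adding machine". The sketch below is an illustration under a stated assumption: it takes the period-doubling case $p_n = 2^n$ and truncates the inverse limit to $\mathbb{Z}/2^n\mathbb{Z}$, representing an element by its binary digits. The names `add_one` and `orbit_length` are hypothetical.

```python
def add_one(bits):
    """Translation by 1 on Z/2^n Z; bits are listed least significant first."""
    result = list(bits)
    for i, bit in enumerate(result):
        if bit == 0:
            result[i] = 1      # no further carry
            return result
        result[i] = 0          # 1 + 1 = 0, carry to the next digit
    return result              # all digits were 1: wrap around to zero

def orbit_length(n):
    """Number of steps before the zero word of length n returns to itself."""
    start = [0] * n
    current, steps = add_one(start), 1
    while current != start:
        current, steps = add_one(current), steps + 1
    return steps

if __name__ == "__main__":
    print(add_one([1, 1, 0]))
    print(orbit_length(3))
```

Each truncation level is a single cycle of length $2^n$, matching the fact that the cycle of intervals of period $2^n$ is permuted cyclically by $f$.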


Kneading Theory 


Kneading theory (Milnor and Thurston, mid-1970s)
gives a complete topological classification of S-unimodal
maps (and more general one-dimensional maps). Let $I_+$
and $I_-$ stand for the components of $I \setminus \{0\}$, where $I_+ \ni
f(0)$. To any point $x \in I$, let us associate its itinerary
$(\varepsilon_n)_{n=0}^{N}$, where $\varepsilon_n \in \{+, -, 0\}$, $N \in \mathbb{Z}_+ \cup \{\infty\}$, in the
following way. If $x$ is precritical then $N \in \mathbb{Z}_+$ is the
smallest number such that $f^N x = 0$, and we let $\varepsilon_N = 0$.
Otherwise, $N = \infty$. For $n < N$, $\varepsilon_n = +$ if $f^n x \in I_+$, and
$\varepsilon_n = -$ if $f^n x \in I_-$.

The kneading sequence of f is the itinerary of the 
critical value f(0). It essentially classifies S-unimodal 
maps: two nonregular S-unimodal maps are topolo- 
gically conjugate if and only if they have the same 
kneading sequence. (In the regular case, one should 
state if the map is hyperbolic or parabolic and 
specify the sign of the multiplier of the correspond- 
ing cycle.) 

The kneading theory completely describes admissible
kneading sequences (those realizable by some unimodal
maps), and orders them linearly in such a way
that a bigger sequence corresponds to a more
“complicated” map. The minimal admissible kneading
sequence, $+ + + \cdots$, is realized by the parabolic map
$x \mapsto x^2 + 1/4$, while the maximal one, $+ - - - \cdots$,
is realized by the Chebyshev map $x \mapsto x^2 - 2$.
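The two extreme kneading sequences can be reproduced directly from the definition of the itinerary. The sketch below is a minimal illustration for the quadratic family $x \mapsto x^2 + c$, computing only a finite prefix of the sequence; the function names are hypothetical, and floating-point iteration is assumed to be accurate enough for these short prefixes.

```python
def itinerary(c, x, length):
    """First symbols of the itinerary of x under P_c(x) = x*x + c."""
    plus_sign = 1.0 if c > 0 else -1.0   # I_+ is the side of 0 containing f(0) = c
    symbols = []
    for _ in range(length):
        if x == 0.0:
            symbols.append("0")          # the orbit hit the critical point: stop
            break
        symbols.append("+" if x * plus_sign > 0 else "-")
        x = x * x + c
    return "".join(symbols)

def kneading_sequence(c, length):
    """Itinerary of the critical value f(0) = c."""
    return itinerary(c, c, length)

if __name__ == "__main__":
    print(kneading_sequence(0.25, 6))
    print(kneading_sequence(-2.0, 6))
    print(kneading_sequence(-1.0, 6))
```

For $c = -1$ the critical point is periodic with period 2, so the critical value is precritical and its itinerary terminates with the symbol 0.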

A central result of the kneading theory is the 
Intermediate Value Theorem asserting that a smooth 
one-parameter family of S-unimodal maps $f_t$ containing
two kneading sequences also contains all
intermediate kneading sequences. In particular, a
family that contains the above maximal and
minimal kneading sequences contains all admissible
kneading sequences. Such a family is called full. We
see that the real quadratic family $\{P_c\}$, $c \in [-2, 1/4]$,
is full: any S-unimodal map is topologically equivalent
to some quadratic polynomial. This indicates
the dynamical significance of the quadratic family.

We say that a one-parameter family of unimodal 
maps $f_t$ is almost full if it contains all admissible
kneading sequences except possibly the minimal one. 


Universality Phenomenon 


Universal Geometry of Doubling Bifurcations 
and the Feigenbaum Attractor 


Let us consider the real quadratic family $P_c : x \mapsto x^2 + c$,
$c \in [-2, 1/4]$. As the parameter $c$ moves down from
1/4, we observe a sequence of doubling bifurcations
$c_n$, where the attracting cycle of period $2^n$ gives birth
to an attracting cycle of period $2^{n+1}$, $n = 0, 1, \ldots$
(see Holomorphic Dynamics and Figure 1). This
sequence converges to the Feigenbaum parameter
$c_\infty$ at an exponential rate: $c_n - c_\infty \sim \lambda^{-n}$, where $\lambda \approx 4.67$.
It turns out that if we consider a similar one-parameter
family of unimodal maps, say $x \mapsto a \sin x$,
we observe a similar sequence of doubling bifurcations
converging to the limit exponentially at the
same rate $\lambda^{-n}$, independently of the family under
consideration.
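The universal rate can be estimated numerically. The sketch below swaps in a standard numerical shortcut rather than locating the bifurcation parameters themselves: it uses the superstable parameters (those at which the critical point 0 is periodic with period $2^n$), which are simple roots found by Newton's method and whose gaps are expected to shrink at the same rate. All function names and the starting guess 3.2 are choices made for this sketch only.

```python
def critical_value_and_slope(c, period):
    """Return P_c^period(0) and its derivative in c, for P_c(x) = x*x + c."""
    x, dx = 0.0, 0.0
    for _ in range(period):
        dx = 2.0 * x * dx + 1.0   # chain rule, using x before it is updated
        x = x * x + c
    return x, dx

def superstable_parameter(period, guess, newton_steps=20):
    """Newton's method for a parameter c at which 0 is periodic with this period."""
    c = guess
    for _ in range(newton_steps):
        x, dx = critical_value_and_slope(c, period)
        c -= x / dx
    return c

def doubling_ratios(levels=8):
    """Ratios of successive gaps between superstable parameters of periods 2^n."""
    params = [0.0, -1.0]          # exact superstable parameters of periods 1 and 2
    ratio, ratios = 3.2, []       # 3.2 is only a starting guess for the ratio
    for n in range(2, levels + 1):
        guess = params[-1] + (params[-1] - params[-2]) / ratio
        params.append(superstable_parameter(2 ** n, guess))
        ratio = (params[-2] - params[-3]) / (params[-1] - params[-2])
        ratios.append(ratio)
    return params, ratios

if __name__ == "__main__":
    params, ratios = doubling_ratios()
    for n, r in enumerate(ratios, start=2):
        print(n, params[n], r)
```

The extrapolated guess at each level uses the previous ratio, which keeps Newton's method near the intended root; going to much higher levels would eventually run into double-precision limits.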

Figure 1 Real quadratic family $P_c : x \mapsto x^2 + c$. This picture
presents how the limit set of the orbit $\{P_c^n(0)\}_{n=0}^{\infty}$ bifurcates as
the parameter $c$ changes from 1/4 on the right to $-2$ on the left.
Three topological types of regimes are intertwined in an intricate
way. The gaps correspond to the regular regimes. The black
regions correspond to the chaotic regimes (though, of course,
there are many narrow invisible gaps therein). In the beginning
(on the right) one can see the cascade of doubling bifurcations.
This picture became symbolic for one-dimensional dynamics.

In the dynamical space, let us consider the
Feigenbaum attractor $A_f$ [1] of an infinitely renormalizable
S-unimodal map $f$ that appears in the limit
of doubling bifurcations (so that the periods of the
periodic intervals $J^n$ are equal to $2^n$). Let us consider
the scaling factors $\sigma_n = |J^n| / |J^{n+1}|$. Then $\sigma_n \to \sigma_\infty$,
where the limiting scaling factor $\sigma_\infty \approx 2.5$ is
independent of the particular map $f$ under consideration.
Thus, the small-scale geometry of $A_f$ is
universal.
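The limiting scaling factor can also be observed numerically. The sketch below uses a proxy rather than the intervals $J^n$ themselves: at an approximate value of the Feigenbaum parameter for $x \mapsto x^2 + c$ (the constant below is an assumed numerical value, not taken from the text), it compares the closest returns $P_c^{2^n}(0)$ of the critical point at successive levels. Names are hypothetical.

```python
C_FEIGENBAUM = -1.401155189092  # assumed approximate Feigenbaum parameter

def critical_returns(c, levels):
    """Values P_c^(2^n)(0) for n = 0, ..., levels."""
    values, x, steps = [], 0.0, 0
    for n in range(levels + 1):
        while steps < 2 ** n:
            x = x * x + c
            steps += 1
        values.append(x)
    return values

def scaling_ratios(c=C_FEIGENBAUM, levels=6):
    """Ratios |P_c^(2^n)(0)| / |P_c^(2^(n+1))(0)| of successive closest returns."""
    v = critical_returns(c, levels)
    return [abs(v[n] / v[n + 1]) for n in range(levels)]

if __name__ == "__main__":
    for n, r in enumerate(scaling_ratios()):
        print(n, r)
```

Only a few levels are used because errors in the parameter are amplified at each doubling; within that range the ratios settle near 2.5.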

This was historically the first observed manifesta- 
tion of the quantitative universality of dynamical 
and parameter structures. 


Feigenbaum-Coullet-Tresser Renormalization 
Conjecture 


To explain the above universality phenomenon,
Feigenbaum and, independently, Coullet and Tresser
formulated the following Renormalization Conjecture.
Let us consider the space $\mathcal{U}$ of S-unimodal
maps $f : [-1, 1] \to [-1, 1]$. A map $f \in \mathcal{U}$ is called
(doubling) renormalizable if it has a cycle
of intervals $J \to J_1 \to J$ of period 2. Then, for any
$n \in \mathbb{Z}_+ \cup \{\infty\}$, we can naturally define $n$-times
renormalizable maps, where $n = 0$ corresponds to
the non-renormalizable case, while $n = \infty$ corresponds
to the infinitely renormalizable case.

Let $\mathcal{U}' \subset \mathcal{U}$ be the space of doubling renormalizable
maps. If $f \in \mathcal{U}'$ then $f^2 : J \to J$ is an S-unimodal
map as well, and we define the (doubling) renormalization
operator $R : \mathcal{U}' \to \mathcal{U}$ as the rescaling of this
map:


$$Rf(x) = \sigma^{-1} f^2(\sigma x)$$


where $\sigma = |J|/2$.
The Renormalization Conjecture asserted that: 


• The renormalization operator $R$ has a unique
fixed point $f_*$, and this point is hyperbolic;

• the stable manifold $W^s(f_*)$ consists of infinitely
renormalizable unimodal maps;

• the unstable manifold $W^u(f_*)$ is one dimensional
and represents an almost full family of unimodal
maps (see the section “Kneading Theory”); and

• the quadratic family $\{P_c\}$ transversally intersects
$W^s(f_*)$ (see Figure 2).


Assuming this conjecture, one can see that for any
curve $\{g_t\}$ in $\mathcal{U}$ that transversally intersects the
stable manifold $W^s(f_*)$ at some moment $t_*$, the
doubling bifurcation parameters $t_n$ converge to $t_*$ at
the exponential rate $\lambda^{-n}$, where $\lambda$ is the unstable
eigenvalue of the differential $DR(f_*)$. This explains
the universal geometry of doubling bifurcations.

Figure 2 Renormalization fixed point. Curve label: Quadratic family.

One can also show that the Feigenbaum attractor
$A_f$ of any map $f \in W^s(f_*)$ is smoothly equivalent to
$A_{f_*}$, which explains the universal small-scale geometry
of these attractors.


Full Renormalization Horseshoe 


Along with period doublings, one can consider 
period triplings, quadruplings, etc. A unimodal
map $f \in \mathcal{U}$ is said to be renormalizable with period
$p$ if it has a cycle of intervals $J \to J_1 \to \cdots \to J_{p-1} \to J$
of period $p$. The corresponding renormalization
operator is defined as $Rf(x) = \sigma^{-1} f^p(\sigma x)$, where
$\sigma = |J|/2$.

The combinatorics or type $\tau$ of the renormalization
operator is the order of the intervals $J_k$, $k = 0, 1, \ldots, p - 1$,
on the real line (up to reversal). (For
instance, there are three admissible combinatorics $\tau$ of
period 5.) If we want to specify the combinatorics of the
renormalization operator under consideration, we use
the notation $R_\tau$. This operator is defined on the “renormalization
strip” $\mathcal{U}_\tau$ of unimodal maps $f \in \mathcal{U}$ that are
renormalizable with combinatorics $\tau$.

The Renormalization Conjecture admits a
straightforward generalization to any renormalization
operator $R_\tau$. More interestingly, one can
formulate a stronger version of it by putting all the
admissible renormalization types together. Let $\mathcal{T}$
stand for the set of all minimal renormalization
types, that is, the types that cannot be factored
through other types. Then the renormalization strips
$\mathcal{U}_\tau$, $\tau \in \mathcal{T}$, are pairwise disjoint, and we can define
the full renormalization operator


$$R : \bigcup_{\tau \in \mathcal{T}} \mathcal{U}_\tau \to \mathcal{U} \qquad [2]$$


by letting $R \,|\, \mathcal{U}_\tau = R_\tau$. Then the strong version of the
renormalization conjecture asserted that:


• there is an $R$-invariant hyperbolic subset $\mathcal{A} \subset \mathcal{U}$
called the full renormalization horseshoe such
that the restriction $R \,|\, \mathcal{A}$ is topologically conjugate
to the full shift $\omega$ on the space $\Sigma$ of bi-infinite
sequences $(\ldots, \tau_{-1}, \tau_0, \tau_1, \ldots)$ of symbols
$\tau_n \in \mathcal{T}$;

• for any $f_* \in \mathcal{A}$, the stable manifold $W^s(f_*)$ consists
of infinitely renormalizable maps $f \in \mathcal{U}$ with the
same combinatorics as $f_*$;

• for any $f_* \in \mathcal{A}$, the unstable manifold $W^u(f_*)$ is
one-dimensional and represents an almost full
family of unimodal maps; and

• the real quadratic family $\{P_c\}$ transversally intersects
all stable manifolds $W^s(f_*)$.
Complex Renormalization
Polynomial-Like Maps 


A polynomial-like map is a holomorphic branched
covering of finite degree $f : U \to U'$, where $U \Subset U' \subset \mathbb{C}$
are topological disks. (In other words, the map $f$ is
proper, that is, the full preimages $f^{-1}(K)$ of compact sets
$K \subset U'$ are compact.) For instance, if $f$ is a
polynomial of degree $d$ then for a sufficiently large
radius $R > 0$, the map $f : f^{-1}(\mathbb{D}_R) \to \mathbb{D}_R$ is a polynomial-like
map of the same degree $d$. We refer to
such polynomial-like maps as “polynomials.”

The filled Julia set of f is the set of nonescaping 
points: 


$$K(f) = \{z : f^n z \in U, \ n = 0, 1, \ldots\}$$


The Julia set of $f$ is the boundary of its filled Julia
set: $J(f) = \partial K(f)$.

A polynomial-like map of degree $d$ has $d - 1$
critical points counted with multiplicities. The Julia
set (and the filled Julia set) is connected if and only
if all the critical points $c_i$ are nonescaping, that is,
$c_i \in K(f)$.

A polynomial-like map of degree 2 is called 
quadratic-like. The Julia set of a quadratic-like 
map is either connected or a Cantor set, depending 
on whether its critical point is nonescaping or 
otherwise. 
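For genuine quadratic polynomials $z \mapsto z^2 + c$, the dichotomy can be explored with a simple escape-time test. The sketch below is only a heuristic illustration: a finite number of iterations can certify escape but cannot prove that an orbit is nonescaping, and the function names, iteration cap and escape radius $\max(2, |c|)$ are choices made for this sketch.

```python
def escapes(z, c, max_iter=200):
    """True if the orbit of z under z -> z*z + c is seen to leave the escape radius."""
    radius = max(2.0, abs(c))   # beyond this radius the orbit tends to infinity
    for _ in range(max_iter):
        if abs(z) > radius:
            return True
        z = z * z + c
    return False                # no escape observed within max_iter steps

def julia_looks_connected(c, max_iter=200):
    """Heuristic test: the critical point 0 is not seen to escape."""
    return not escapes(0j, c, max_iter)

if __name__ == "__main__":
    for c in (0.0, -1.0, -2.0, 1.0):
        print(c, julia_looks_connected(c))
```

Parameters passing this test approximate the Mandelbrot set from outside; increasing `max_iter` tightens the approximation near its boundary.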

The domain of a polynomial-like map is allowed 
to be slightly adjusted by taking $V'$ to be a
topological disk such that $U \subset V' \subset U'$ and letting
$V = f^{-1}(V')$. We say that two polynomial-like maps
represent the same germ if one can be obtained from 
the other by a sequence of such adjustments. 

We will be mostly interested in the quadratic case; so let $\mathcal{Q}$ be the space of quadratic-like germs considered up to affine conjugacy, and let $\mathcal{C}$ be the connectedness locus in $\mathcal{Q}$, that is, the subset of $f \in \mathcal{Q}$ with connected Julia set. The space $\mathcal{Q}$ has a natural complex analytic structure such that holomorphic curves in $\mathcal{Q}$ are represented by holomorphic families $f_\lambda(z)$ of quadratic-like maps.

Two polynomial-like maps are called hybrid equivalent if they are conjugate by a quasiconformal map $h$ such that $\bar\partial h = 0$ a.e. on $K(f)$ (in particular, $h$ is conformal on $\operatorname{int} K(f)$). By the Straightening Theorem, any polynomial-like map is hybrid equivalent (after an adjustment of its domain) to a polynomial of the same degree (called the “straightening” of $f$). The straightening depends only on the germ of $f$.

For a quadratic-like map $f$ with connected Julia set, the straightening $P_c: z \mapsto z^2 + c$ is unique, $c = \chi(f)$. Thus, we obtain the straightening map $\chi: \mathcal{C} \to M$, where $M$ is the Mandelbrot set (see Holomorphic Dynamics). We let $\mathcal{H}_c = \chi^{-1}(c)$ be the hybrid class passing through a point $c \in M$. One can show that $\mathcal{H}_c$ is a codimension-one submanifold in $\mathcal{Q}$.

Any quadratic-like map has two fixed points counted with multiplicity. In the case of connected Julia set, these fixed points have a different dynamical meaning: one of them, called $\alpha$, is either attracting, or neutral, or repelling separating, that is, $J(f) \setminus \{\alpha\}$ is disconnected. The other one, called $\beta$, is either parabolic with multiplier 1 (and then it coincides with $\alpha$) or repelling nonseparating.

In what follows, we normalize quadratic-like 
maps so that 0 is their critical point. 


Complex Renormalization and Little 
Mandelbrot Sets 


A quadratic-like map $f: U \to U'$ with connected Julia set is called renormalizable if there is a topological disk $V \ni 0$ and a natural number $p \geq 2$ called the renormalization period such that:

• letting $g = f^p|V$ and $V' = g(V)$, the map $g: V \to V'$ is quadratic-like;

• the little Julia set $K(g)$ is connected; and

• the sets $f^n(K(g))$, $n = 1, \ldots, p-1$, can intersect $K(g)$ only at the $\beta$-fixed point of $g$.


Under these circumstances, the quadratic-like germ $g$ considered up to affine conjugacy is called the renormalization of the quadratic-like germ $f$; $g = Rf$. Moreover, one says that $f$ is primitively renormalizable if the little Julia sets $f^n(K(g))$, $n = 1, \ldots, p-1$, are pairwise disjoint. Otherwise, $f$ is satellite renormalizable.

As in the unimodal case, one can define combinatorics or type $\tau$ of the complex renormalization. Roughly speaking, renormalizable maps with the same combinatorics have the same renormalization period and the “same position” of the little Julia sets $f^k(K(g))$ in $\mathbb{C}$ (the rigorous definition is based on the notion of Thurston’s equivalence from Holomorphic Dynamics).


Theorem 1 (Douady and Hubbard 1986). The set of parameters $c$ for which a quadratic map $P_c: z \mapsto z^2 + c$ is renormalizable with a given combinatorics $\tau$ forms a homeomorphic copy $M^\tau$ of the Mandelbrot set $M$.


This theorem explains the presence of many little Mandelbrot sets that are observable on the computer pictures of $M$ (see Figures 3 and 4). Moreover, the copies corresponding to the primitive renormalization originate at primitive hyperbolic components (see Holomorphic Dynamics), while the copies obtained by a satellite renormalization originate at satellite hyperbolic components attached to some “mother” hyperbolic component. (Satellite copies attached to the main cardioid are particularly prominent on the pictures of $M$.)

Figure 3 A primitive copy of the Mandelbrot set.

Figure 4 The satellite copy of the Mandelbrot set attached to the main cardioid at the point of doubling bifurcation.

Given a combinatorial type $\tau$, the set $\mathcal{Q}^\tau$ of quadratic-like germs $f \in \mathcal{Q}$ that are renormalizable with combinatorics $\tau$ (the complex renormalization strip) is the union of hybrid classes passing through the little copy $M^\tau$. As in the real case, let us consider the set $\mathcal{T}$ of all minimal combinatorial types. Then the corresponding renormalization strips $\mathcal{Q}^\tau$ are pairwise disjoint, and we can define the full complex renormalization operator $R: \bigcup_{\tau \in \mathcal{T}} \mathcal{Q}^\tau \to \mathcal{Q}$.


Renormalization Theorem 


The first proof of the Renormalization Conjecture in the period-doubling case was based on rigorous computer estimates (Lanford 1982). It was followed, in the 1980s, by works of Epstein, Eckmann, Khanin, and Sinai, among others, which gave a better conceptual understanding and provided proofs of many ingredients of the picture (without computer assistance).

The turning point in this development occurred 
when methods of holomorphic dynamics and con- 
formal geometry were introduced into the subject 
(Douady and Hubbard 1985, Sullivan 1986). This 
led to the proof of the renormalization conjecture in 
the space of quadratic-like germs: 


Theorem 2 (Sullivan-McMullen-Lyubich, the 1990s). For any real combinatorics $\tau \in \mathcal{T}$, the operator $R_\tau$ has a unique fixed point $f_\tau$ in the space $\mathcal{Q}$. Moreover, $f_\tau$ is hyperbolic, its stable manifold $W^s(f_\tau)$ coincides with the hybrid class $\mathcal{H}_c$, $c = \chi(f_\tau)$, while the real slice of the unstable manifold represents an almost full family of unimodal maps.


This result was further extended to the smooth 
category by de Faria, de Melo, and Pinto. 
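The universality behind the period-doubling case can be observed numerically in the real quadratic family $P_c(x) = x^2 + c$. The sketch below uses a standard Newton iteration, chosen here for illustration and not taken from the text, to locate the superstable parameters $c_k$ at which the critical point 0 has period $2^k$. The ratios of successive gaps approach the Feigenbaum constant $\delta \approx 4.669$, which is the unstable eigenvalue of the period-doubling renormalization fixed point. All function names are hypothetical.

```python
def critical_iterate(c, n):
    """Return (x, dx/dc) where x = P_c^n(0) for P_c(x) = x^2 + c."""
    x, dx = 0.0, 0.0
    for _ in range(n):
        # Chain rule: d/dc (x^2 + c) = 2 x dx/dc + 1 (uses the old x).
        x, dx = x * x + c, 2.0 * x * dx + 1.0
    return x, dx


def superstable_parameters(levels):
    """Return [c_0, ..., c_levels], where 0 has period 2^k under P_{c_k}."""
    cs = [0.0, -1.0]  # c_0 (fixed critical point) and c_1 (period 2)
    delta = 4.0       # rough initial guess for the gap ratio
    for k in range(2, levels + 1):
        # Extrapolate the next parameter from the previous gap and ratio.
        c = cs[-1] + (cs[-1] - cs[-2]) / delta
        for _ in range(30):  # Newton's method on c -> P_c^(2^k)(0)
            x, dx = critical_iterate(c, 2 ** k)
            c -= x / dx
        cs.append(c)
        delta = (cs[-2] - cs[-3]) / (cs[-1] - cs[-2])
    return cs


def gap_ratios(cs):
    """Ratios (c_{k-1} - c_{k-2}) / (c_k - c_{k-1}) for k >= 2."""
    return [(cs[k - 1] - cs[k - 2]) / (cs[k] - cs[k - 1])
            for k in range(2, len(cs))]


if __name__ == "__main__":
    for ratio in gap_ratios(superstable_parameters(10)):
        print(f"{ratio:.6f}")
```

In double precision the ratios stabilize for roughly the first dozen levels and then degrade, because the gaps $c_k - c_{k-1}$ shrink geometrically.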


MLC, Density of Hyperbolicity, and 
Geometry of Feigenbaum Julia Sets 


The “Mandelbrot set is locally connected” (MLC) 
conjecture (see Holomorphic Dynamics) is intimately 
related to the renormalization phenomenon. This 
connection was first revealed by the following result: 


Theorem 3 (Yoccoz 1990, unpublished). Let us consider a nonrenormalizable quadratic polynomial $P_c: z \mapsto z^2 + c$ with connected Julia set and both fixed points repelling. Then the Julia set $J(P_c)$ is locally connected and the Mandelbrot set is locally connected at $c$.


This result was recently extended to higher-degree unicritical polynomials $z \mapsto z^d + c$ (Kahn-Lyubich, preprint 2005).

The MLC Conjecture is still open for general infinitely 
renormalizable parameters. However, the similar pro- 
blem for the real quadratic family has been resolved. 
It implies the real version of the Fatou conjecture in 
the quadratic case (see Holomorphic Dynamics): 


Theorem 4 (Lyubich 1997). Hyperbolic maps are 
dense in the real quadratic family. 


This result was recently extended to higher-degree 
polynomials by Kozlovskii, Shen, and van Strien 
(preprint 2003). 

Infinitely renormalizable quadratic maps of bounded combinatorial type (i.e., with bounded relative periods $p_{n+1}/p_n$) supply us with a rich class of fractals with very interesting geometry. These Julia sets are “hairy” at the origin, that is, their blow-ups densely fill the whole plane (this phenomenon is related to the universal geometry of the Feigenbaum attractors; McMullen (1996)). However, some of them have zero Lebesgue measure (Yarrington, thesis 1995) and Hausdorff dimension smaller than 2 (Avila-Lyubich, preprint 2004). It is unknown whether this happens for all of them or not (in particular, the answer is unknown for the Feigenbaum map born in the cascade of doubling bifurcations).


Regular or Stochastic Dichotomy 
Stochastic Maps 


An S-unimodal map $f$ is called stochastic if it has an absolutely continuous invariant measure $\mu$. In this case, $f$ is topologically chaotic (see the section “Topological dynamics”) and $\mu$ is supported on the transitive cycle of intervals $\bigcup J_k$. Moreover, $\mu$ has a positive characteristic exponent,

$$\chi_\mu = \int \log |Df| \, d\mu > 0$$

and Lebesgue almost all orbits are equidistributed with respect to $\mu$, that is, for Lebesgue a.e. $x \in I$,

$$\frac{1}{n} \sum_{k=0}^{n-1} \phi(f^k x) \to \int \phi \, d\mu$$

for any continuous function $\phi$. The map $f^p | J_k$ is mixing with respect to $\mu$, and in fact, is weakly Bernoulli.
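The characteristic exponent can be estimated as a Birkhoff average of $\log|Df|$ along a typical orbit. The sketch below is an illustration with hypothetical names and arbitrarily chosen parameters. It does this for $P_c(x) = x^2 + c$ at $c = -2$ (the Chebyshev map, a classical stochastic example whose exponent is $\log 2$) and, for contrast, at $c = -0.5$, where an attracting fixed point gives a negative exponent.

```python
import math


def lyapunov_exponent(c, x0=0.1, n_transient=1000, n_steps=100_000):
    """Estimate the Birkhoff average of log|P_c'(x)| along the orbit of x0,
    where P_c(x) = x^2 + c and P_c'(x) = 2x."""
    x = x0
    for _ in range(n_transient):  # discard the initial transient
        x = x * x + c
    total = 0.0
    for _ in range(n_steps):
        # The floor guards against log(0) if the orbit lands exactly on 0.
        total += math.log(max(abs(2.0 * x), 1e-300))
        x = x * x + c
    return total / n_steps


if __name__ == "__main__":
    print("c = -2.0:", lyapunov_exponent(-2.0), "vs log 2 =", math.log(2.0))
    print("c = -0.5:", lyapunov_exponent(-0.5))
```

A floating-point orbit is only a pseudo-orbit, so the estimate at $c = -2$ is heuristic evidence rather than a computation of $\int \log|Df| \, d\mu$.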
Here are two important criteria for stochasticity: 


• Collet-Eckmann condition (see Holomorphic Dynamics). These maps have extra strong stochastic properties, notably, the exponential decay of correlations.

• Martens-Nowicki condition. To state it, we need to define the principal nest of intervals, $I^0 \supset I^1 \supset \cdots \ni 0$. Here $I^0 = [-\alpha, \alpha]$, where $\alpha$ is the fixed point with negative multiplier, and $I^{n+1}$ is inductively defined as the component of $f^{-l_n}(I^n)$ containing 0, where $l_n$ is the moment of first return of the orbit of 0 to $I^n$. Let us consider the scaling factors $\lambda_n = |I^n|/|I^{n-1}|$. If $\sum \sqrt{\lambda_n} < \infty$ then $f$ is stochastic.


Let $\mathcal{N} \subset [-2, 1/4]$ be the set of parameters $c$ for which the quadratic map $P_c$ is topologically chaotic. Not every such map is stochastic. However, the set of stochastic parameters has positive Lebesgue measure (Jakobson 1981), and in fact,


Theorem 5 (Lyubich 2000). For a.e. $c \in \mathcal{N}$, the map $P_c$ satisfies the Martens-Nowicki condition, and thus, is stochastic.


Avila and Moreira (2005) went on to prove that for a.e. $c \in \mathcal{N}$, the map $P_c$ is Collet-Eckmann.


Renormalization Horseshoe 


Let us consider the complexification of the renormalization operator [2],

$$R: \bigcup_{\tau \in \mathcal{T}} \mathcal{Q}^\tau \to \mathcal{Q} \qquad [3]$$

acting in the space of quadratic-like maps.


Theorem 6 (Lyubich 2002). The “Strong Renormalization Conjecture” is valid for the operator [3].


Let $\mathcal{I} \subset [-2, 1/4]$ be the set of parameters for which the quadratic map $P_c$ is infinitely renormalizable. The above theorem implies that this set has zero Lebesgue measure. (Avila and Moreira went on to prove that $\mathrm{HD}(\mathcal{I}) < 1$.)


Regular or Stochastic Dichotomy 
Putting together Theorems 5 and 6, we obtain: 


Theorem 7 For a.e. $c \in [-2, 1/4]$, the quadratic map $P_c$ is either regular or stochastic.


This result gives a complete probabilistic picture 
of dynamics in the real quadratic family. It has been 
later transferred to any nondegenerate real analytic 
family of S-unimodal maps (Avila-Lyubich-de 
Melo), and further to a generic smooth family of S- 
unimodal maps (Avila—Moreira). 

Palis has formulated a strong general conjecture 
(in all dimensions) asserting that a typical (from 
the probabilistic point of view) smooth dynamical 
system f has finitely many attractors supporting 
SRB measures (see Lyapunov Exponents and 
Strange Attractors) that govern the behavior of 
Lebesgue a.e. trajectories of f. The above results 
confirm the Palis Conjecture in the setting of S- 
unimodal maps. 


Other Universality Classes 


From a more general point of view, renormalization 
is an appropriately rescaled return map to a relevant 
piece of the phase space, viewed as an operator in 
some class of dynamical systems. From this point of 
view, most dynamical systems are “renormalizable,” 
and the renormalization approach often provides a 
deep insight into the nature of the systems in 
question. 

Here is a partial list of classes of nonlinear systems that exhibit universality with an underlying renormalization mechanism (we provide a few relevant names, but there are many more people who contributed to the corresponding theories):

• holomorphic germs near indifferent equilibria (Yoccoz, Shishikura, McMullen);

• critical circle maps (Kadanoff, Feigenbaum, Rand, Lanford, Swiatek, de Faria, Yampolsky);

• non-renormalizable quadratic-like maps of Fibonacci type (Lyubich-Milnor);

• conservative two-dimensional diffeomorphisms near the point of breaking of KAM tori (MacKay, Koch); and

• dissipative Hénon-like maps (Collet-Eckmann-Koch, de Carvalho-Lyubich-Martens).


See also: Fractal Dimensions in Dynamics; Holomorphic 
Dynamics; Lyapunov Exponents and Strange Attractors; 
Multiscale Approaches. 
Introduction


The problem of fluid turbulence is commonly 
regarded as one of the most challenging problems of 
theoretical physics and mathematics. There is general 
agreement that the Navier-Stokes equations (NSEs) 
provide a satisfactory basis for the description of 
turbulent motions of homogeneous Newtonian fluids 
such as gases and most liquids. But the difficulty of 
generating solutions of these equations for high- 
Reynolds-number flows has prevented accurate 
answers to simple questions such as the question of 
the discharge of turbulent pipe flow as a function of 
the pressure head or the question of the heat transport 
by turbulent convection in a fluid layer heated from 
below. In view of this difficulty, it has become an 
attractive idea to obtain rigorous bounds on turbulent 
transports. Variational methods have played an 
important role in the derivation of such bounds. 

There is another motivation for the use of varia- 
tional methods for the understanding of turbulent 
fluid systems. Experimenters have sometimes noted 
the tendency of turbulent flows to maximize trans- 
ports under given external conditions. In his pioneer- 
ing paper, Howard (1963) mentions that the Malkus 
hypothesis of a maximum heat transport by thermal 
convection had motivated him to derive upper bounds 
through the use of variational methods. The techni- 
ques developed by Howard have later been applied to 
other kinds of turbulent transports by Busse. While 
relatively simple ordinary differential equations are 
obtained when the equation of continuity is not 
imposed as a constraint, the Euler-Lagrange equa- 
tions for a stationary value of the variational 
functional lead to nonlinear partial differential equa- 
tions when solenoidal extremalizing vector fields are 
required. Nevertheless, using boundary layer methods 
one can derive approximate analytical solutions even 
in the limit of asymptotically large Rayleigh and 
Reynolds numbers (Busse 1969, 1978). 


In the following, we shall first discuss the energy 
method which provides necessary conditions for the 
existence of turbulent solutions of the underlying 
equations and then turn to the problem of upper 
bounds for the turbulent momentum transport in the 
plane Couette flow configuration as a particular 
example. The properties and physical relevance of 
the extremalizing vector fields will be discussed in a 
final section. 


Energy Method 


For simplicity, we consider the NSEs for a homogeneous incompressible fluid with a constant kinematic viscosity $\nu$ in an arbitrary fixed domain $D$. Using the diameter $d$ of the domain as length scale and $d^2/\nu$ as timescale, we can write the NSEs of motion in dimensionless form,

$$\frac{\partial}{\partial t} v + v \cdot \nabla v = -\nabla p + f + \nabla^2 v \qquad [1a]$$

$$\nabla \cdot v = 0 \qquad [1b]$$

where $f$ denotes some given steady distribution of a force density. On the boundary $\partial D$ of the domain $D$, steady velocities parallel to the boundary may be specified. We assume that the basic steady solution of the problem is given by $v_s = Re\,\hat{v}$ where the average of $|\hat{v}|^2/2$ over the domain $D$ (indicated by angular brackets) is unity, $\langle |\hat{v}|^2 \rangle = 2$. Any velocity field $v_t$ different from $v_s$, that is, with $u \equiv v_t - v_s \neq 0$, must obey the equations

$$\frac{\partial}{\partial t} u + v_s \cdot \nabla u + u \cdot \nabla v_s + u \cdot \nabla u = -\nabla p + \nabla^2 u \qquad [2a]$$

$$\nabla \cdot u = 0 \qquad [2b]$$

together with the homogeneous boundary conditions for $u$ on $\partial D$. By multiplying eqn [2a] by $u$ and averaging the result over the domain $D$ we obtain the relationship

$$\frac{1}{2} \frac{d}{dt} \langle u \cdot u \rangle = -\langle |\nabla u|^2 \rangle - Re\, \langle u \cdot (u \cdot \nabla) \hat{v} \rangle \qquad [3]$$
where the vanishing of $u$ on $\partial D$ and equations such as

$$\langle u \cdot (v_s \cdot \nabla) u \rangle = \frac{1}{2} \langle v_s \cdot \nabla (u \cdot u) \rangle = \frac{1}{2} \langle \nabla \cdot (v_s \, u \cdot u) \rangle = 0$$

have been used to prove that the terms $v_s \cdot \nabla u$, $u \cdot \nabla u$, and $\nabla p$ do not enter the balance [3]. This balance is called the Reynolds-Orr energy equation and is the basis for the application of the energy method. The lowest value of $Re$ for which the right-hand side of [3] can be non-negative is called the energy Reynolds number $Re_E$. For $Re < Re_E$ the steady solution $v_s$ is absolutely stable and the energy of any disturbance $u$ must decay exponentially in time. $Re > Re_E$ is a necessary condition for the existence of a persistent turbulent state of fluid flow. $Re_E$ is determined as the solution of the variational problem:


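For a given flow $\hat{v}$ in $D$ find the minimum $Re_E$ of the functional

$$\mathcal{R}_E(\tilde{u}) \equiv \frac{\langle |\nabla \tilde{u}|^2 \rangle}{\langle -\tilde{u} \cdot (\tilde{u} \cdot \nabla) \hat{v} \rangle} \qquad [4]$$

among all vector fields $\tilde{u}$ which satisfy the conditions $\nabla \cdot \tilde{u} = 0$ in $D$, $\tilde{u} = 0$ on $\partial D$, and $\langle \tilde{u} \cdot (\tilde{u} \cdot \nabla) \hat{v} \rangle < 0$.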


For $Re > Re_E$ there will exist at least one vector field $u$, namely the minimizing solution $\tilde{u}$ of the variational problem [4], the energy of which does not decay, at least not initially. In the derivation of the Euler-Lagrange equations as necessary conditions for stationary values of the variational functional [4],

$$\frac{1}{2} G \, (\tilde{u}_j \partial_j \hat{v}_i + \tilde{u}_j \partial_i \hat{v}_j) = -\partial_i \pi + \partial_j \partial_j \tilde{u}_i \qquad [5a]$$

$$\partial_j \tilde{u}_j = 0 \qquad [5b]$$

the constraint $\nabla \cdot \tilde{u} = 0$ has been taken into account through the Lagrange multiplying function $\pi$. $G$ is a stationary value of the functional [4] and in general there exist many of those which are determined as eigenvalues of the linear boundary value problem [5] together with its boundary condition $\tilde{u}_i = 0$ on $\partial D$. Only the infimum of all $G$ provides the energy Reynolds number $Re_E$. Many details on the energy method can be found in Joseph's book (1976). Here we just wish to remark that the Reynolds-Orr balance [3] remains valid when the problem is considered in a system rotating with a constant angular velocity $\Omega_D$ since the Coriolis force does not contribute to the energy balance [3]. The values of $Re_E$ are usually much smaller than the critical values $Re_c$ for the onset of infinitesimal disturbances as can be seen from Table 1. Here the experimentally determined values $Re_G$ for the instability of the basic flow state have also been listed. A unique situation occurs in the small gap limit of the Taylor-Couette system where $Re_E$ and $Re_c$ coincide for a special value of the dimensionless mean rotation rate $\Omega_D$ (Busse 2002).

Table 1 Reynolds numbers for shear flows

                                                  $Re_E$     $Re_G$ (from exp.)   $Re_c$
Plane Couette flow                                82.6       ~1300                $\infty$
Poiseuille flow (channel flow)                    99.2^a     ≈2000^a              5772^a
Hagen-Poiseuille flow (pipe flow)                 81.5^a     ≈2100^a              $\infty$
Circular Couette flow with $\Omega_D = Re_E/2$    82.6       ≈82.6                82.6

^a The maximum velocity and the channel width $d$ (radius $d$ in the case of pipe flow) have been used in the definition of $Re$.


Variational Problem for Turbulent 
Momentum Transport 


In order to introduce the variational method for bounds on turbulent transports we consider the simplest configuration for which a nontrivial solution of the NSEs of motion exists: the configuration of plane Couette flow (Figure 1). The Reynolds number is defined in this case in terms of the constant relative motion $U_0 i$ between the plates, $Re = U_0 d/\nu$, where $i$ is the unit vector parallel to the plates and $\nu$ is the kinematic viscosity of the fluid. Using the distance $d$ between the plates as length scale and $d^2/\nu$ as timescale, the basic equations can be written in the form

$$\frac{\partial}{\partial t} v + v \cdot \nabla v = -\nabla p + \nabla^2 v \qquad [6]$$

$$\nabla \cdot v = 0 \qquad [7]$$


We use a Cartesian system of coordinates with the $x$, $z$-coordinates in the directions of $i$ and $k$, respectively, where $k$ is the unit vector normal to the plates such that the boundary conditions are given by

$$v = \mp \frac{1}{2} Re \, i \quad \text{at } z = \pm \frac{1}{2} \qquad [8]$$

Figure 1 Geometrical configuration of the plane Couette flow problem.

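After separating the velocity field $v$ into its mean and fluctuating parts, $v = U + \check{v}$ with $\bar{v} = U$, $\bar{\check{v}} = 0$, where the bar denotes the average over planes $z = \text{const.}$, we obtain by multiplying eqn [6] by $\check{v}$ and averaging it over the entire fluid layer (indicated by angular brackets)

$$\frac{1}{2} \frac{d}{dt} \langle |\check{v}|^2 \rangle = -\langle |\nabla \check{v}|^2 \rangle - \left\langle \check{u} \check{w} \cdot \frac{\partial}{\partial z} U \right\rangle \qquad [9]$$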


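Here $\check{u}$ denotes the component of $\check{v}$ perpendicular to $k$ and $\check{w}$ is its $z$-component. We define fluid turbulence under stationary conditions by the property that quantities averaged over planes $z = \text{const.}$ are time independent. Accordingly, the equation for the mean flow $U$ can be integrated to yield

$$\frac{d}{dz} U = \overline{\check{w} \check{u}} - \langle \check{w} \check{u} \rangle - Re \, i \qquad [10]$$

where the boundary condition [8] has been employed. With this relationship, $U$ can be eliminated from the problem and the energy balance

$$\langle |\nabla \check{v}|^2 \rangle + \langle |\overline{\check{u} \check{w}} - \langle \check{u} \check{w} \rangle|^2 \rangle = Re \, \langle \check{u}_x \check{w} \rangle \qquad [11]$$

is obtained where the identity $\langle |\overline{\check{u} \check{w}}|^2 \rangle - |\langle \check{u} \check{w} \rangle|^2 = \langle |\overline{\check{u} \check{w}} - \langle \check{u} \check{w} \rangle|^2 \rangle$ has been used.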

Since the momentum transport in the $x$-direction between the moving rigid plates is described by $M = -dU_x/dz \,|_{z = \pm 1/2} = \langle \check{u}_x \check{w} \rangle + Re$, we can conclude immediately that the momentum transport by turbulent flow always exceeds the corresponding laminar value because $\langle \check{u}_x \check{w} \rangle$ is positive according to the relationship [11]. Since a lower bound on $M$ thus exists, an upper bound $\mu$ on $\langle \check{u}_x \check{w} \rangle$ as a function of $Re$ is of primary interest. Following Howard (1963), it can be shown that $\mu(Re)$ is a monotonic function and it is therefore equivalent to ask for a lower bound $R$ of $Re$ at a given value $\mu$ of $\langle \check{u}_x \check{w} \rangle$. We are thus led to the following formulation of the variational problem:


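Find the minimum $R(\mu)$ of the functional

$$\mathcal{R}(v, \mu) = \frac{\langle |\nabla v|^2 \rangle}{\langle u_x w \rangle} + \mu \frac{\langle |\overline{u w} - \langle u w \rangle|^2 \rangle}{\langle u_x w \rangle^2} \qquad [12]$$

among all solenoidal vector fields $v = u + k w$ (with $u \cdot k = 0$) that satisfy the boundary condition $v = 0$ at $z = \pm 1/2$ and the condition $\langle u_x w \rangle > 0$.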


The Euler-Lagrange equations as necessary conditions for an extremal value of the functional are given by

$$w \frac{d}{dz} U^* + k \, u \cdot \frac{d}{dz} U^* = -\nabla \pi + \nabla^2 v \qquad [13]$$

$$\nabla \cdot v = 0 \qquad [14]$$

where $dU^*/dz$ is defined by

$$\frac{d}{dz} U^* \equiv \overline{u w} - \langle u w \rangle - i \left( R - \frac{\langle |\nabla v|^2 \rangle}{2 \mu} \right) \qquad [15]$$

and where $\mu = \langle u_x w \rangle$ has been set. When eqns [13]-[15] are compared with the equations for $\check{v}$ and for $U$, a strong similarity can be noticed. The variational problem does not exhibit any time dependence, but the Euler-Lagrange equations may still be regarded as the symmetric analogue of the NSEs for steady flow.


Upper Bounds on the Turbulent 
Momentum Transport 


A simple analytical solution of the variational problem can be obtained when the constraint $\nabla \cdot v = 0$ is dropped. In that case it is evident that the minimum of the functional [12] is reached when $v$ is independent of $x$, $y$, and when $u_x = w = f(z)$ holds. The Euler-Lagrange equations then assume the form of an ordinary differential equation,

$$f'' = f \left[ \mu \left( \frac{f^2}{\langle f^2 \rangle} - 1 \right) - R + \frac{\langle f'^2 \rangle}{\langle f^2 \rangle} \right] \qquad [16]$$

Since the variational functional [12] is homogeneous in $v$, we are free to use a normalization condition for which we choose $\max[f(z)] = 1$. Multiplication of eqn [16] by $f'$ and integration yield

$$f'^2 = \frac{\mu}{2 k^2 \langle f^2 \rangle} (1 - k^2 f^2)(1 - f^2) \quad \text{with } k^2 \equiv \frac{\mu}{2 (R + \mu) \langle f^2 \rangle - 2 \langle f'^2 \rangle - \mu} \qquad [17]$$


This equation can be solved in terms of elliptic integrals. The minimum $R(\mu)$ is determined by the relationships

$$R = \frac{8}{3} \left[ K^2 (1 + k^2) + K^3/D - 3 k^2 K D \right], \qquad \mu = 8 k^2 K D \qquad [18]$$

where $D(k)$ and $K(k)$ are the complete elliptic integrals usually labeled by these letters. For details, see the analysis by Howard (1963) of an analogous problem. In the asymptotic case of large Reynolds numbers, relationships [18] yield the upper bound

$$\mu(Re) = \frac{9}{128} Re^2 \qquad [19]$$
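The passage from [18] to [19] can be checked numerically. The sketch below assumes the reading $R = \frac{8}{3}[K^2(1+k^2) + K^3/D - 3k^2KD]$, $\mu = 8k^2KD$ with $D = (K - E)/k^2$. It evaluates the complete elliptic integrals by the arithmetic-geometric mean, which is an implementation choice made here; the function names are hypothetical. As $k \to 1$ the ratio $\mu/R^2$ drifts slowly (like $1/K$) toward $9/128 \approx 0.0703$, while for $k \to 0$ one recovers $R \to 2\pi^2$.

```python
import math


def complete_k_and_d(kp):
    """Return (k^2, K(k), D(k)) for complementary modulus kp = sqrt(1 - k^2),
    0 < kp < 1, using the arithmetic-geometric mean (AGM).
    D(k) = (K - E)/k^2, and (K - E)/K = sum_n 2^(n-1) c_n^2."""
    k2 = 1.0 - kp * kp
    a, b = 1.0, kp
    c = math.sqrt(k2)
    s = 0.5 * c * c  # n = 0 term of the sum
    weight = 1.0     # 2^(n-1) for n = 1
    for _ in range(60):
        if abs(a - b) <= 1e-15 * a:
            break
        a, b, c = 0.5 * (a + b), math.sqrt(a * b), 0.5 * (a - b)
        s += weight * c * c
        weight *= 2.0
    big_k = math.pi / (2.0 * a)
    return k2, big_k, big_k * s / k2


def bound_without_continuity(kp):
    """Return (R, mu) from relationships [18] for complementary modulus kp."""
    k2, big_k, big_d = complete_k_and_d(kp)
    r = (8.0 / 3.0) * (big_k ** 2 * (1.0 + k2)
                       + big_k ** 3 / big_d
                       - 3.0 * k2 * big_k * big_d)
    mu = 8.0 * k2 * big_k * big_d
    return r, mu


if __name__ == "__main__":
    for kp in (1e-5, 1e-20, 1e-80, 1e-150):
        r, mu = bound_without_continuity(kp)
        print(f"kp = {kp:.0e}: R = {r:.2f}, mu/R^2 = {mu / r ** 2:.5f}")
    print("9/128 =", 9.0 / 128.0)
```

The complementary modulus is used as the input because $k$ itself is indistinguishable from 1 in double precision in the regime of interest.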

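In solving the full eqns [13]-[15], it is convenient to eliminate eqn [14] through the general representation of the solenoidal vector field $v$,

$$v = \nabla \times (\nabla \phi \times k) + \nabla \psi \times k \qquad [20]$$

We assume that the minimizing vector field $v$ does not depend on $x$, although a rigorous proof for this property can be given only for small values of $\mu$. Introducing the notations $\theta \equiv \partial \psi/\partial y$ and $w = -\partial^2 \phi/\partial y^2$ we are thus led to the general ansatz

$$w = w^{(N)} \equiv \sum_{n=1}^{N} \alpha_n w_n(z) \phi_n(y) \qquad [21a]$$

$$\theta = \theta^{(N)} \equiv \sum_{n=1}^{N} \theta_n(z) \phi_n(y) \qquad [21b]$$

where $N$ may tend to infinity and the functions $\phi_n(y)$ satisfy the equation

$$\frac{\partial^2}{\partial y^2} \phi_n = -\alpha_n^2 \phi_n \qquad [22]$$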
In the following, it will be assumed that the positive wavenumbers $\alpha_n$ are ordered according to their size, $\alpha_{n-1} \leq \alpha_n \leq \alpha_{n+1}$. The solutions of the form [21] of the Euler-Lagrange equations exhibit a boundary layer structure for large $\mu$ as sketched in Figure 2. Accordingly, the $N$-$\alpha$ solutions are characterized by a hierarchy of $N$ boundary layers at each plate and provide the upper bound sequentially with increasing $\mu$ starting with $N = 1$. The extremalizing vector fields thus exhibit a bifurcation structure similar to that found in many cases of the transition to turbulence. The thicknesses of the boundary layers decrease with increasing $\mu$ and their ratio from one layer to the next approaches the factor 4 as indicated in Figure 3. The typical scale of motion increases linearly with distance from the wall as assumed in Prandtl’s mixing-length theory. But the discreteness of the scales reflects the fact that effective transports require preferred scales. Asymptotically, the upper bound for the momentum transport approaches

$$\mu(Re) = 0.010 \, Re^2 \qquad [23]$$

which represents a significant improvement over the relationship [19]. Nevertheless, the upper bound still exceeds the measured values of the momentum transport by more than a factor of 10.

Figure 2 Qualitative sketch of the boundary layer structure of the extremalizing $N$-$\alpha$ solution.

Figure 3 Qualitative sketch of the nested boundary layers that characterize the vector field of maximum transport. The profile of the mean shear is shown on the right side.


Discussion 


Bounds like those for the momentum transport have 
been obtained for many other kinds of turbulent 
transports. For details we refer to the review articles 
listed below. Usually, the formulation of the upper 
bound problem requires that the external conditions 
are homogeneous in two spatial dimensions such 
that a separation of the turbulent velocity, tempera- 
ture, or magnetic fields into mean and fluctuating 
parts is possible. In this respect, the variational 
methods for upper bounds are more restricted than 
those used for determination of the energy Reynolds number $Re_E$. The latter problem, incidentally, corresponds to the limit $\mu \to 0$ of variational problems of the type [12] as can be seen from a comparison with expression [4].

In recent years, the background field method has 
been introduced by Doering and Constantin (1994) as 
an alternative way for obtaining bounds on properties 
of turbulent flows. When optimized, it becomes 
equivalent to the variational method discussed in this 
article as has been demonstrated by Kerswell (1998). 
The fact that nonoptimized bounds can be obtained relatively easily emphasizes the point that the extremalizing vector fields are the most interesting aspect of the variational problems. They often exhibit similarities with the observed turbulent velocity fields, in particular as far as the mean flows are concerned. In the case of convection in a layer heated from below, the transition of the bound from the 1-$\alpha$ solution to the 2-$\alpha$ solution corresponds closely to the experimentally observed transition from convection rolls to bimodal convection (Busse 1969).

The close similarities between variational functionals 
for rather different physical systems suggest corre- 
sponding similarities between the respective turbulent 
fields. For example, the analogy between the fluctuat- 
ing component of the temperature in turbulent convec- 
tion and the streamwise component of the fluctuating 
velocity field in shear flow turbulence has been 
demonstrated and employed in a theory of the atmo- 
spheric boundary layer (Busse 1978). Better bounds 
and more physically realistic properties of the extre- 
malizing vector fields can be expected when additional 
constraints are imposed. For example, the energy 
balances for poloidal and toroidal components of the 
velocity field can be applied separately. But these 
developments are still in their initial stages. 


See also: Bifurcations in Fluid Dynamics; Fluid 
Mechanics: Numerical Methods; Turbulence Theories. 
Ginzburg-Landau-type problems are variational
problems which consider a Dirichlet-type energy 
posed on complex-valued functions, penalized by a 
potential term which has a well in the unit circle of 
the complex plane. The denomination comes from 
the physical model of superconductivity of Ginzburg 
and Landau. They are phase-transition-type models 
in the sense that they describe the state of the 
material according to different “phases” which can 
coexist in a sample and be separated by various 
types of interfaces. We start by presenting the 
physical model (readers familiar with it may wish 
to skip the next two sections and go straight to the 
section “The simplified model"). 


Introduction to the Ginzburg-Landau Model 


The Ginzburg-Landau model was introduced by 
Ginzburg and Landau in the 1950s as a pheno- 
menological model to describe superconductivity, 
and was later justified as a limit of the quantum 


BCS theory of Bardeen-Cooper-Schrieffer. It is a 
model of great importance and recognition in physics 
(with several Nobel prizes awarded for it: Landau, 
Ginzburg, Abrikosov). In addition to its importance 
in the modeling of superconductivity, the Ginzburg- 
Landau model turns out to be mathematically 
extremely close to the Gross-Pitaevskii model for 
superfluidity, and models for rotating Bose-Einstein 
condensates, which all have in common the appear- 
ance of topological defects called “vortices.” 
Superconductivity, which was discovered in 1911 by Kamerlingh Onnes, consists in the complete loss
of resistivity of certain metals and alloys at very low 
temperatures: the two most striking consequences of 
it being the possibility of permanent superconduct- 
ing currents and the particular behavior that an 
external magnetic field applied to the sample gets 
expelled from the material and can generate 
vortices, through which it penetrates the sample. 


The Energy Functional 


After a series of dimension reductions, the Ginzburg-Landau model describes the state of the superconducting sample occupying a region $\Omega$ and submitted to the external magnetic field $h_{ex}$, below the critical temperature, through its Gibbs energy:

$$G_\varepsilon(\psi, A) = \frac{1}{2} \int_\Omega |\nabla_A \psi|^2 + \frac{(1 - |\psi|^2)^2}{2 \varepsilon^2} + \frac{1}{2} \int_{\mathbb{R}^3} |\operatorname{curl} A - h_{ex}|^2 \qquad [1]$$

In this expression, the first unknown $\psi$ is the “order parameter” in physics. It is a complex-valued condensed wave function, indicating the local state of the material, or the phase (in the Landau theory approach of phase transitions): $|\psi|^2$ is the density of the “Cooper pairs” of superconducting electrons explaining superconductivity in the BCS approach. With our normalization $|\psi| \leq 1$, and where $|\psi| \approx 1$ the material is in the superconducting phase, while where $|\psi| \approx 0$, it is in the normal phase (i.e., behaves like a normal conductor), the two phases being able to coexist in the sample.

The second unknown $A$ is the electromagnetic vector potential of the magnetic field, a function from $\Omega$ to $\mathbb{R}^3$. The induced magnetic field in the sample is deduced by $h = \operatorname{curl} A$. The notation $\nabla_A$ denotes the covariant derivative $\nabla - iA$. The superconducting current is the vector $j$ of components

$$j_k = (i\psi, (\nabla_A)_k \psi) \qquad [2]$$

where $(\cdot,\cdot)$ denotes the scalar product in $\mathbb{C}$ identified with $\mathbb{R}^2$.

Finally, the parameter $\varepsilon$ is the inverse of the “Ginzburg-Landau parameter” $\kappa$, a dimensionless parameter (ratio of the penetration depth and the coherence length) depending on the material only.

Most variational studies of Ginzburg-Landau focus on the regime of large $\kappa$ or small $\varepsilon$, corresponding to “extreme type-II” superconductors, also called the London limit. In this limit, the potential term acts as a singular perturbation, and the characteristic size of the vortices is $\varepsilon \to 0$; vortices become line-like topological singularities, which makes it easier to extract and describe them.

This model is a $U(1)$-gauge theory, that is, it is invariant under the gauge transformations:

$$\psi \mapsto \psi e^{i\Phi}, \qquad A \mapsto A + \nabla \Phi \qquad [3]$$

where $\Phi$ is a smooth real-valued function. The physically relevant quantities are those that are gauge invariant, such as the energy $G_\varepsilon$, $|\psi|$, $h$, and the superconducting current $j$.
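The gauge invariance can be verified in one line of calculus. The short derivation below, written with $\nabla_A = \nabla - iA$ and a smooth real-valued function $\Phi$, shows that the covariant derivative only picks up the phase factor and that the curl of the potential is unchanged, so each term of the energy is left invariant.

```latex
\begin{align*}
\nabla_{A + \nabla\Phi}\bigl(\psi e^{i\Phi}\bigr)
  &= \bigl(\nabla - i(A + \nabla\Phi)\bigr)\bigl(\psi e^{i\Phi}\bigr) \\
  &= e^{i\Phi}\bigl(\nabla\psi + i\psi\nabla\Phi - iA\psi - i\psi\nabla\Phi\bigr)
   = e^{i\Phi}\,\nabla_A\psi, \\
\operatorname{curl}(A + \nabla\Phi) &= \operatorname{curl} A,
  \qquad \bigl|\psi e^{i\Phi}\bigr| = |\psi| .
\end{align*}
```

Since $|e^{i\Phi} \nabla_A \psi| = |\nabla_A \psi|$, the moduli of the covariant derivative, of the order parameter, and the induced field all keep their values under the transformation.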
For more on the model, we refer to the physics 
literature (e.g., DeGennes (1966) and Tinkham 
(1996)). 


Reductions of the Model 


The goal of variational studies of the Ginzburg- 
Landau model is to relate the energy to the vortices 
and the applied field. In three dimensions (3D), 
vortices are filaments, or lines of zeros of the order 
parameter $\psi$, around which $\psi$ has a nonzero
winding number. These are quite delicate to describe 
in 3D (we will mention some results below), so a 
simplification that is commonly made consists in 
reducing to a two-dimensional model. 

When reducing to 2D, one assumes that every- 
thing is independent of the vertical direction, and 
that the applied magnetic field is also vertical. The 
domain $\Omega$ is then a two-dimensional, bounded and
(for simplicity) simply connected open set, which is 
the horizontal section of an infinite vertical 
cylinder. One can also imagine it represents a thin 
film. 

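In 2D, the energy is written the same way:

$$ G_\varepsilon(\psi, A) = \frac{1}{2}\int_\Omega |\nabla_A \psi|^2 + \frac{(1-|\psi|^2)^2}{2\varepsilon^2} + |\operatorname{curl} A - h_{ex}|^2 \qquad [4] $$

where this time $A$ is $\mathbb{R}^2$-valued, and the induced
magnetic field $h = \operatorname{curl} A = \partial_1 A_2 - \partial_2 A_1$ is now a
real-valued function, which can be taken to be equal
to $h_{ex}$ (now a real positive number) in $\mathbb{R}^2 \setminus \Omega$.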

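The stationary states of the system are the critical
points of $G_\varepsilon$, or the solutions of the Ginzburg-Landau
equations:

$$ \begin{cases} -(\nabla_A)^2 \psi = \dfrac{1}{\varepsilon^2}\,\psi\,(1 - |\psi|^2) & \text{in } \Omega \\ -\nabla^\perp h = (i\psi, \nabla_A \psi) & \text{in } \Omega \\ h = h_{ex} & \text{on } \partial\Omega \\ \nabla_A \psi \cdot \nu = 0 & \text{on } \partial\Omega \end{cases} \qquad [5] $$

where $\nabla^\perp$ denotes $(-\partial_{x_2}, \partial_{x_1})$.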

A common simplification consists in suppressing
the magnetic field, and thus in studying the
simplified energy

$$ E_\varepsilon(u) = \frac{1}{2}\int_\Omega |\nabla u|^2 + \frac{(1-|u|^2)^2}{2\varepsilon^2} \qquad [6] $$

where the order parameter is commonly denoted by
$u$, and is still complex valued. This energy, which
can be seen as a complex analog of the real-valued
Allen-Cahn model of phase transitions, has been
extensively studied, especially since the work of
Bethuel-Brezis-Hélein, where the domain $\Omega$ is
assumed to be two dimensional and simply connected.
The higher-dimensional case has also been
considered.


Vortices and Critical Fields


We now need to explain more precisely what a 
vortex is. In two dimensions, a vortex is an object 
centered at an isolated zero of $u$ (or $\psi$), around
which the phase of $u$ has a nonzero winding number
called the "degree of the vortex." It is the simplest
example of a topological defect. If the zero is located
at $x_0$, the winding number or degree is the integer
that can be computed by

$$ \frac{1}{2\pi}\int_{\partial B(x_0, r)} \frac{\partial \varphi}{\partial \tau} \in \mathbb{Z} \qquad [7] $$

where $r$ is small enough, and $\varphi$ is the phase of $u$, that
is, $u$ can be written $u = |u| e^{i\varphi}$. For example, the phase
$\varphi = d\theta$, where $\theta$ is the polar angle centered at $x_0$, yields
a vortex of degree $d$. Observe that the phase $\varphi$ is not a
well-defined function: it is multivalued (and defined up
to a multiple of $2\pi$); however, we have the important relation

$$ \operatorname{curl} \nabla\varphi = 2\pi \sum_i d_i \delta_{a_i} \qquad [8] $$

where the $a_i$'s are the zeros of $u$, the $d_i$'s the associated
degrees, and $\delta_x$ denotes the Dirac mass at $x$.
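
As a minimal numerical sketch of formula [7], the degree can be approximated by summing the phase increments of $u$ along a discretized circle around the zero. The helper names `degree` and `model_vortex`, the sample count, and the tanh modulus profile are illustrative assumptions, not part of the theory described here.

```python
import cmath
import math

def degree(u, x0, r, samples=720):
    """Winding number of the phase of u along the circle |x - x0| = r."""
    total = 0.0
    prev = u(x0 + r)
    for k in range(1, samples + 1):
        cur = u(x0 + r * cmath.exp(2j * math.pi * k / samples))
        # phase increment between consecutive samples, taken in (-pi, pi]
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2.0 * math.pi))

def model_vortex(d, center=0j):
    """Hypothetical test map: modulus tanh(|z|), phase d * theta, zero at `center`."""
    def u(x):
        z = x - center
        return math.tanh(abs(z)) * cmath.exp(1j * d * cmath.phase(z))
    return u

print(degree(model_vortex(2), 0j, 0.5))
print(degree(model_vortex(-1, center=1 + 1j), 1 + 1j, 0.3))
```

A circle that does not enclose the zero gives degree 0 with the same routine, which matches the role of the Dirac masses in [8].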

When $\varepsilon$ is small, it is clear from [4] or [6] that $|u|$
prefers to be close to 1, and a scaling argument hints
that $|u|$ is different from 1 in regions of characteristic
size $\varepsilon$. Of course this is an intuitive picture, and several
mathematical notions are used to describe the vortices.

Vortices appear due to the applied field $h_{ex}$. For
type-II superconductors there are essentially three
critical fields, $H_{c_1}$, $H_{c_2}$, $H_{c_3}$, critical values of $h_{ex}$ for
which phase transitions occur. For
$h_{ex} < H_{c_1} = O(|\log\varepsilon|)$, there are no vortices and the
superconductor is in the superconducting phase
$|\psi| \sim 1$ everywhere. At $H_{c_1}$ the first vortices appear,
and their number increases as $h_{ex}$ is raised. When
they become numerous they tend to arrange in
triangular lattices called Abrikosov lattices, as
observed in experiments and predicted by Abrikosov
from the Ginzburg-Landau model, in a very
influential work. At the second critical field
$H_{c_2} = O(1/\varepsilon^2)$ bulk superconductivity is destroyed,
and surface superconductivity remains until
$H_{c_3} = O(1/\varepsilon^2)$, the third critical field, above which
$\psi \equiv 0$ and the material is normal.


Issues and Methods 


The variational approach to Ginzburg-Landau consists
in expressing the energy in terms of reduced
quantities or objects, in particular in terms of the
vortices. This requires developing mathematical tools
to describe and characterize the vortices (in particular,
giving suitable definitions of a "vortex structure"
for a given $u$ or $\psi$), and estimating precisely the energetic
cost of each vortex and of their interaction. This
allows us to obtain results of variational convergence
of the energies $G_\varepsilon$, $E_\varepsilon$ (or their variants), that is, to
derive $\Gamma$-limits, or "reduced problems" posed in terms
of the vortices, which are easier to minimize than the
original ones. These limits depend on the regime of
applied field, and allow one to characterize, in turn,
the critical fields and the optimal distribution and
number of the vortices, if any.

Variational methods also serve to solve some 
inverse problems, that is, to prove the existence of 
solutions of the equation which have some given 
properties, such as a given distribution of vortices,
through local minimization procedures, or the use of 
topological methods based on investigating the 
topology of the energy levels. 

Nonvariational approaches of Ginzburg-Landau 
are also very useful, in particular to identify the 
profiles of the solutions, to describe vortices of 
nonminimizing critical points, or to perform a bifurca- 
tion analysis around the normal solution at Hea. 


The Simplified Model 


We first present the variational study of $E_\varepsilon$ [6] in
dimension 2, together with the mathematical tools
used for both [6] and [4]. We will restrict to the
asymptotics $\varepsilon \to 0$, since this is the situation where
the most results are known.

Let us present informally the essential ingredients 
of the analysis. 


Tracing the Vortices 


The easiest way to trace the vortices is to use the
current $(iu, \nabla u)$ (or the "superconducting current"
$j = (i\psi, \nabla_A \psi)$ for the case with magnetic field). Here
we recall that $(\cdot,\cdot)$ denotes the scalar product in $\mathbb{C}$ as
identified with $\mathbb{R}^2$, that is, $(iu, \nabla u) = (u \times \partial_1 u, u \times \partial_2 u)$
with $\times$ the vector product in $\mathbb{R}^2$.

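The curl of the current is the vorticity of the map $u$,
exactly as in fluid mechanics. Writing $u = \rho e^{i\varphi}$ we
have (at least formally) $(iu, \nabla u) = \rho^2 \nabla\varphi$, and since
$\rho = |u|$ is close to 1 (other than in the small vortex
regions), we have the approximation

$$ \operatorname{curl}(iu, \nabla u) = \operatorname{curl}(\rho^2 \nabla\varphi) \simeq \operatorname{curl} \nabla\varphi = 2\pi \sum_i d_i \delta_{a_i} \qquad [9] $$

where the $a_i$'s are the zeros of $u$ (or its vortices) and
the $d_i$'s their degrees, or

$$ \operatorname{curl}(i\psi, \nabla_A \psi) + \operatorname{curl} A \simeq \operatorname{curl} \nabla\varphi $$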
in the case with magnetic field. This can be made
rigorous (see Jerrard and Soner (2002) and Sandier
and Serfaty (to appear)), that is, one can express that

$$ \operatorname{curl}(iu, \nabla u) - 2\pi \sum_i d_i \delta_{a_i} \to 0 \quad \text{as } \varepsilon \to 0 \qquad [10] $$

(or respectively $\operatorname{curl}(i\psi, \nabla_A \psi) + \operatorname{curl} A - 2\pi \sum_i d_i \delta_{a_i} \to 0$)
in some weak functional norm, thus giving a
rigorous use of [8]. The quantity


$$ \mu(u) = \operatorname{curl}(iu, \nabla u) \qquad [11] $$

or

$$ \mu(\psi, A) = \operatorname{curl}(i\psi, \nabla_A \psi) + \operatorname{curl} A = \operatorname{curl} j + h \qquad [12] $$

in the case with magnetic field, will thus be called
the vorticity and be used to trace the vortices, in this
limit $\varepsilon \to 0$. The relation

$$ \mu - 2\pi \sum_i d_i \delta_{a_i} \to 0 \quad \text{as } \varepsilon \to 0 \qquad [13] $$

states that it is close to being a measure.

This is also called the Jacobian determinant if
written (with differential forms)
$Ju = d(iu, du) = (i\,du, du) = 2(u_{x_1} \times u_{x_2})\,dx_1 \wedge dx_2$,
and under this form it can be used in higher dimensions.


The Cost of Each Vortex 


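Here we investigate informally the cost of a vortex
of degree $d$. We know already that the characteristic
length scale of variation of $u$ is $\varepsilon$, and that $(1 - |u|^2)^2$
is strongly penalized. Thus, we may expect
that $|u|$ is close to 1 at a distance $\gg \varepsilon$ from the zeros.
Assuming that $x_0$ is a zero of $u$, and taking formally
$|u| = 1$ for $|x - x_0| \ge \varepsilon$, we may write $u = e^{i\varphi}$ and
$|\nabla u| = |\nabla\varphi|$ for $|x - x_0| \ge \varepsilon$.
Then, we have

$$ \frac{1}{2}\int_{B(x_0,R)\setminus B(x_0,\varepsilon)} |\nabla u|^2 \ge \frac{1}{2}\int_\varepsilon^R \int_{\partial B(x_0,r)} \left|\frac{\partial\varphi}{\partial\tau}\right|^2 dr \ge \frac{1}{2}\int_\varepsilon^R \frac{1}{2\pi r}\left(\int_{\partial B(x_0,r)} \frac{\partial\varphi}{\partial\tau}\right)^2 dr \qquad [14] $$

$$ = \frac{1}{2}\int_\varepsilon^R \frac{4\pi^2 d^2}{2\pi r}\,dr = \pi d^2 \log\frac{R}{\varepsilon} \qquad [15] $$

where we have used the Cauchy-Schwarz inequality
for [14], and the characterization of the degree [7].
We may also observe that this lower bound is sharp
if $\partial\varphi/\partial\tau$ is constant, that is, if the phase is $d\theta$ (and
the vortex radial). The cost associated to $|u|$ in the
energy imposes the length scale $\varepsilon$ and is generally of
order 1 ($|\nabla u| \le C/\varepsilon$), thus negligible compared to
the cost associated to the phase, which blows up like
$\log(1/\varepsilon)$ as $\varepsilon \to 0$.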
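
The logarithmic cost in [15] can be checked numerically for the radial phase $d\theta$, for which $|\nabla\varphi|^2 = d^2/r^2$. The following is only a sketch: the function name `phase_energy`, the geometric radial grid, and the values of `eps` and `R` are assumptions chosen for illustration.

```python
import math

def phase_energy(d, eps, R, cells=2000):
    """Approximate (1/2) * integral of |grad(d*theta)|^2 over eps < |x - x0| < R."""
    ratio = (R / eps) ** (1.0 / cells)  # geometric grid in the radial variable
    total, r_in = 0.0, eps
    for _ in range(cells):
        r_out = r_in * ratio
        r_mid = math.sqrt(r_in * r_out)
        # |grad(d*theta)|^2 = d^2 / r^2, and the circle of radius r has length 2*pi*r
        total += 0.5 * (d / r_mid) ** 2 * 2.0 * math.pi * r_mid * (r_out - r_in)
        r_in = r_out
    return total

eps, R = 1e-3, 1.0
single = phase_energy(1, eps, R)
double = phase_energy(2, eps, R)
print(single, math.pi * math.log(R / eps))  # numerical value next to pi * d^2 * log(R/eps)
print(double, 2.0 * single)                 # one degree-2 vortex versus two degree-1 vortices
```

The comparison in the last line reflects the $d^2$ dependence: at this formal level a single vortex of degree 2 costs twice as much as two vortices of degree 1.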

The above estimate is only valid as long as 
B(xo, R) does not contain any other zero of u. If 
vortices get close to each other or become numer- 
ous, one needs refined techniques to estimate their 
cost. This can be done through a “ball-construction 
method" introduced independently by Jerrard and 
Sandier. 


Evaluating the Total Interaction Cost of Vortices 


In a first approach, one studies configurations which
satisfy the upper bound $E_\varepsilon(u) \le C|\log\varepsilon|$. Then,
lower bounds of the type [15] show that the total
sum of the degrees (hence the total number of
vortices of nonzero degree) remains bounded as $\varepsilon \to 0$.
Up to extraction, we may assume these zeros $a_i$
converge as $\varepsilon \to 0$ to a finite set of points $p_i$, with a
total degree still denoted $d_i$. This can also be expressed
as $\mu(u_\varepsilon) \to 2\pi \sum_i d_i \delta_{p_i}$ as $\varepsilon \to 0$.

This is not the only case of interest, since 
unbounded numbers of vortices do arise, especially 
in the physical situation of the energy with magnetic 
field, as we will see in the next section. However, 
this hypothesis, which was made in the work of 
Bethuel-Brezis-Hélein, makes the analysis easier 
and already allows us to exhibit the main 
phenomena. 

Vortices in superconductors are generated by the 
presence of the external magnetic field hex. For the 
energy without magnetic field, this has to be 
replaced by some boundary condition which forces 
some degree. Bethuel-Brezis-Hélein considered the 
fixed Dirichlet boundary condition $u = g$ on $\partial\Omega$,
where $g$ is a fixed unit-valued map on $\partial\Omega$, of degree
$d > 0$. This forces $u$ to have a total degree $d$ in $\Omega$.
However, the Neumann boundary condition, for
instance, can also be considered (the minimizers of
$E_\varepsilon$ are then simply constants, hence trivial, but
one can still look for other critical points).

Let us return to lower bounds in order to look
for the next order term in the energy (still with
formal arguments). Cutting out holes $\cup_i B(p_i, \rho)$ of
fixed size $\rho$ around the limiting vortices $p_i$, we may
assume that $u = e^{i\varphi}$ in $\Omega_\rho := \Omega \setminus \cup_i B(p_i, \rho)$, with $\varphi$ a
real-valued function, defined modulo $2\pi$. Minimizing
the energy outside of the holes amounts to
solving

$$ \min_{\substack{u:\,\Omega_\rho \to S^1 \\ u = g \text{ on } \partial\Omega \\ \deg(u,\,\partial B(p_i,\rho)) = d_i}} \frac{1}{2}\int_{\Omega_\rho} |\nabla u|^2 $$

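This is a harmonic map problem, whose solution is
given in terms of $\varphi$ by

$$ \begin{cases} \Delta\varphi = 0 & \text{in } \Omega_\rho \\ \dfrac{\partial\varphi}{\partial\tau} = \left(ig, \dfrac{\partial g}{\partial\tau}\right) & \text{on } \partial\Omega \\ \displaystyle\int_{\partial B(p_i,\rho)} \dfrac{\partial\varphi}{\partial\tau} = 2\pi d_i \end{cases} $$

and in terms of the harmonic conjugate $\Phi$, which is
the function (up to a constant) such that
$\nabla\varphi = \nabla^\perp \Phi$:

$$ \begin{cases} \Delta\Phi = 0 & \text{in } \Omega_\rho \\ \dfrac{\partial\Phi}{\partial\nu} = \left(ig, \dfrac{\partial g}{\partial\tau}\right) & \text{on } \partial\Omega \\ \displaystyle\int_{\partial B(p_i,\rho)} \dfrac{\partial\Phi}{\partial\nu} = 2\pi d_i \end{cases} \qquad [16] $$

As $\rho \to 0$, $\Phi$ behaves like the solution $\Phi_0$ of

$$ \begin{cases} \Delta\Phi_0 = 2\pi \sum_i d_i \delta_{p_i} & \text{in } \Omega \\ \dfrac{\partial\Phi_0}{\partial\nu} = \left(ig, \dfrac{\partial g}{\partial\tau}\right) & \text{on } \partial\Omega \end{cases} \qquad [17] $$
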

Hence, we have

$$ \frac{1}{2}\int_{\Omega_\rho} |\nabla\varphi|^2 = \frac{1}{2}\int_{\Omega_\rho} |\nabla\Phi|^2 \simeq \frac{1}{2}\int_{\Omega_\rho} |\nabla\Phi_0|^2 = \pi \sum_i d_i^2 \log\frac{1}{\rho} + W_d(p_1, \dots, p_n) + o(1) \quad \text{as } \rho \to 0 \qquad [18] $$

where

$$ W_d(p_1, \dots, p_n) = -\pi \sum_{i \ne j} d_i d_j \log|p_i - p_j| - \pi \sum_i d_i R(p_i) + \frac{1}{2}\int_{\partial\Omega} \Phi_0 \left(ig, \frac{\partial g}{\partial\tau}\right) \qquad [19] $$

and $R(x) = \Phi_0(x) - \sum_j d_j \log|x - p_j|$. The function
$W$ was introduced by Bethuel-Brezis-Hélein and
called the renormalized energy, since it consists of
the part of the energy that is left after subtracting
the "infinite part" in $|\log\varepsilon|$ from $E_\varepsilon$. It contains the
(logarithmic) interaction energy between the vortices:
we see that vortices with degrees of the same sign
repel one another while vortices with degrees of
opposite signs attract one another. The $\pi d_i^2 \log(1/\rho)$
term corresponds to the self-interaction, or cost of
the vortex of core of size $\rho$; it is what replaces the
infinite term in the formal calculation.
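
The repulsion and attraction described above can be illustrated with the pairwise logarithmic term of [19] alone. This sketch deliberately drops the terms involving $R$ and the boundary integral, so it is not the full renormalized energy; the function name `pair_interaction` and the sample positions are assumptions made for the example.

```python
import math

def pair_interaction(points, degrees):
    """-pi * sum over ordered pairs i != j of d_i d_j log|p_i - p_j|.

    Points are complex numbers. Only the pairwise part of W_d is computed.
    """
    total = 0.0
    for i, (p, dp) in enumerate(zip(points, degrees)):
        for j, (q, dq) in enumerate(zip(points, degrees)):
            if i != j:
                total -= math.pi * dp * dq * math.log(abs(p - q))
    return total

near = [0j, 0.1 + 0j]
far = [0j, 0.5 + 0j]

# Same-sign degrees: the energy decreases as the vortices move apart (repulsion).
print(pair_interaction(near, [1, 1]), pair_interaction(far, [1, 1]))
# Opposite-sign degrees: the energy decreases as the vortices approach (attraction).
print(pair_interaction(near, [1, -1]), pair_interaction(far, [1, -1]))
```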

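Now [18] is a good estimate for the optimal
energy outside of the holes, while the energy in holes
of size $\rho$ can be bounded below by [15]. Given the
degree $d_i$ on the boundary $\partial B(p_i, \rho)$ of the small
hole, $B(p_i, \rho)$ contains one or several zeros of $u$ of
degrees $\delta_k$ with total degree $\sum_k \delta_k = d_i$. In view of
[15], since the cost of a vortex of degree $d$ grows like
$\pi d^2 |\log\varepsilon|$, and since the infimum of $\sum_k \delta_k^2$ under the
constraint $\sum_k \delta_k = d_i$ is attained for $\delta_k = \operatorname{sign}(d_i)$, the least costly
way to achieve this is to have $|d_i|$ vortices of degree
$\operatorname{sign}(d_i)$. The smallest lower bound possible is thus

$$ \frac{1}{2}\int_{B(p_i,\rho)} |\nabla u|^2 + \frac{(1-|u|^2)^2}{2\varepsilon^2} \ge \pi |d_i| \log\frac{\rho}{\varepsilon} + C \qquad [20] $$

where the constant $C$ can be described explicitly.
Adding up the results of [20] and [18], we find

$$ E_\varepsilon(u) \ge \pi \sum_i d_i^2 \log\frac{1}{\rho} + \pi \sum_i |d_i| \log\frac{\rho}{\varepsilon} + W_d(p_1, \dots, p_n) + C + o_\rho(1) + o_\varepsilon(1) $$

$$ \ge \pi \sum_i |d_i| \log\frac{1}{\varepsilon} + W_d(p_1, \dots, p_n) + C + o_\rho(1) + o_\varepsilon(1) \qquad [21] $$

with equality only if $u$ has $|d_i|$ zeros of degree
$\operatorname{sign}(d_i)$ in each $B(p_i, \rho)$.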
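
The combinatorial step used above, namely that the sum of squares of nonzero integers with prescribed total $d_i$ is smallest when every term equals $\operatorname{sign}(d_i)$, can be confirmed by brute force on small cases. The function name `cheapest_split` and the search bounds `max_parts` and `max_abs` are hypothetical choices for this sketch.

```python
from itertools import product

def cheapest_split(d, max_parts=4, max_abs=3):
    """Search nonzero integer tuples summing to d and return (min sum of squares, tuple)."""
    values = [k for k in range(-max_abs, max_abs + 1) if k != 0]
    best = None
    for parts in range(1, max_parts + 1):
        for deltas in product(values, repeat=parts):
            if sum(deltas) == d:
                cost = sum(k * k for k in deltas)
                if best is None or cost < best[0]:
                    best = (cost, deltas)
    return best

print(cheapest_split(3))
print(cheapest_split(-2))
```

Within the searched range the minimal cost equals $|d|$, in agreement with the coefficient $\pi |d_i|$ in [20].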

This provides a lower bound of the energy in 
terms of the vortices. Moreover, this bound is sharp: 
one can construct test configurations which have the 
given limiting vortices (p;,d;), and an energy equal 
to the right-hand side of [21]. 

One can thus deduce the behavior of global
minimizers of the energy. Given the total degree
$d = \deg(g) > 0$ on $\partial\Omega$, we need $\sum_i d_i = d$, and the
lowest value achievable under this constraint in
the right-hand side of [21] is to have $d_i = 1$ for
every $i$, and thus to have exactly $d$ vortices of
degree 1. Moreover, the limiting points $p_1, \dots, p_d$
should minimize $W$. We are thus led to the first
main result.


Theorem 1 (Bethuel-Brezis-Hélein). Minimizers of
$E_\varepsilon$ under the boundary condition $u = g$, $\deg(g) = d > 0$,
have $d$ zeros of degree 1, which converge as $\varepsilon \to 0$
to a minimizer of $W$.


This result can be rephrased as a result of
$\Gamma$-convergence of $E_\varepsilon - \pi d |\log\varepsilon|$. It reduces the
minimization of $E_\varepsilon$ to one of $W$, which is a finite-dimensional
problem (interaction of point charges).
Thus, we see again the interest of studying this
asymptotic limit $\varepsilon \to 0$: the vortices become
pointlike and the problem reduces to a finite-dimensional
one, namely that of minimizing the vortex
interaction.


Further Results 


A nonvariational approach also allowed Bethuel- 
Brezis-Hélein to prove a further correspondence 
between $E_\varepsilon$ and $W$: they obtained that critical points
of $E_\varepsilon$, under the upper bound $E_\varepsilon \le C|\log\varepsilon|$, have
vortices which converge to a critical point of $W$.
Other important results are the study of the blow-up
profiles or solutions in the whole plane, by
Brezis-Merle-Rivière and Mironescu.

In two dimensions, the variational approach is 
also used to solve inverse problems (construct 
solutions) and study variants of the energy with 
pinning (or weighted) terms. 

The variational approach is also fruitful in higher
dimensions. In dimension 3, for example, vortices are
not points but vortex lines, and the Jacobian
$Ju = d(iu, du)$ can be seen as a current carried by the
vortex line, with $\|Ju\|$, the total mass of the current, equal to
$\pi$ times the length of the line. It was established by
Jerrard and Soner that $Ju_\varepsilon$ is compact in some weak
sense, and converges, up to extraction, to $\pi$ times an
integer-multiplicity rectifiable current $J$, with

$$ \liminf_{\varepsilon \to 0} \frac{E_\varepsilon(u_\varepsilon)}{|\log\varepsilon|} \ge \|J\| $$

In fact, a complete $\Gamma$-convergence result for
$E_\varepsilon/|\log\varepsilon|$ can be proved (see the work of Alberti-Baldo-Orlandi),
and thus minimizing $E_\varepsilon$ reduces in
the limit to minimizing the length of the line, leading
to straight lines, or in higher dimensions, to
codimension-2 minimal currents. This is a nontrivial
problem, in contrast to dimension 2, where the $\Gamma$-limit
of $E_\varepsilon/|\log\varepsilon|$ is trivial, which required going to
the lower-order term to find the nontrivial renormalized
energy limit $W$.


The Functional with Magnetic Field 


The aim here is to achieve the same objective: 
express or bound from below the energy by terms 
which depend only on the vortices and their degrees. 
The method consists in transposing the type of 
analysis above taking into account the magnetic 
field contribution to see how the external field 
triggers the sudden appearance of vortices, and for 
what values they appear (thus retrieving the critical 
fields, etc.). One of the main difficulties consists in the 
fact that the number of vortices becomes divergent, 


which requires more delicate estimates. Also, it is then 
no longer possible to study the convergence of the 
individual zeros of $\psi$, so one studies instead the limit of
rescalings of the vorticity measures $\mu(\psi, A)$.


Splitting of the Energy and Main Results 


Let us recall that in the case with magnetic field, the
vorticity is given by [12]. In addition, we may
assume that the second set of equations in [5],

$$ -\nabla^\perp h = j \ \text{ in } \Omega, \qquad h = h_{ex} \ \text{ on } \partial\Omega \qquad [22] $$

is satisfied (if not, keeping $\psi$ fixed and choosing $A$
which satisfies this equation always decreases the
energy). Taking the curl of this equation, we find
exactly

$$ \begin{cases} -\Delta h + h = \mu(\psi, A) & \text{in } \Omega \\ h = h_{ex} & \text{on } \partial\Omega \end{cases} \qquad [23] $$


Thus, the vorticity and the induced magnetic field
are in one-to-one correspondence with each other.
Combining this with the relation [13], we are led to the
approximate relation

$$ \begin{cases} -\Delta h + h = 2\pi \sum_i d_i \delta_{a_i} & \text{in } \Omega \\ h = h_{ex} & \text{on } \partial\Omega \end{cases} \qquad [24] $$

where again the $a_i$'s are the vortex centers and the $d_i$'s
their degrees, well known in physics as the
"London equation." It shows how the magnetic
field is induced by the vortices, which act like
"charges," and how the magnetic field "penetrates
the sample" around the positive vortex locations.
Of course this equation is only an approximation,
because the singularities at the $a_i$'s, where $h$ would
become infinite, are really smoothed out in $\mu(\psi, A)$;
however, the approximation is good far from
the vortex cores, just as [17] is an approximation
for [16].

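It is then natural to introduce the field corresponding
to the vortex-free situation, which is $h_{ex} h_0$,
where $h_0$ solves

$$ \begin{cases} -\Delta h_0 + h_0 = 0 & \text{in } \Omega \\ h_0 = 1 & \text{on } \partial\Omega \end{cases} \qquad [25] $$

$h_0$ is thus a fixed smooth function, depending only
on $\Omega$, and when there are no vortices, we expect $h$ to
be approximately $h_{ex} h_0$. Moreover, $h' := h - h_{ex} h_0$
then solves

$$ \begin{cases} -\Delta h' + h' = \mu(\psi, A) \simeq 2\pi \sum_i d_i \delta_{a_i} & \text{in } \Omega \\ h' = 0 & \text{on } \partial\Omega \end{cases} \qquad [26] $$
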
Defining the Green kernel $G(\cdot, y)$ by

$$ \begin{cases} -\Delta G + G = \delta_y & \text{in } \Omega \\ G = 0 & \text{on } \partial\Omega \end{cases} \qquad [27] $$

and $S$ by $S(x, y) = 2\pi G(x, y) + \log|x - y|$, for $x$ far
enough from the $a_i$'s, we may approximate $h'$ by

$$ h'(x) \simeq 2\pi \sum_i d_i G(x, a_i) \qquad [28] $$


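Using the second Ginzburg-Landau equation [22]
and the fact that $|\psi| \le 1$, we have $|\nabla_A \psi| \ge |j| = |\nabla h|$,
thus $G_\varepsilon(\psi, A) \ge \frac{1}{2}\int_\Omega |\nabla h|^2 + |h - h_{ex}|^2$. Plugging
in the decomposition $h = h_{ex} h_0 + h'$ and using an
integration by parts and [26], one finds

$$ G_\varepsilon(\psi, A) \ge \frac{h_{ex}^2}{2}\int_\Omega |\nabla h_0|^2 + |h_0 - 1|^2 + h_{ex}\int_\Omega \nabla h_0 \cdot \nabla h' + (h_0 - 1) h' + \frac{1}{2}\int_\Omega |\nabla h'|^2 + |h'|^2 $$

$$ = h_{ex}^2 J_0 + h_{ex}\int_\Omega (h_0 - 1)\,\mu(\psi, A) + \frac{1}{2}\int_\Omega |\nabla h'|^2 + |h'|^2 \qquad [29] $$

where $J_0$ is the constant $\frac{1}{2}\int_\Omega |\nabla h_0|^2 + |h_0 - 1|^2$.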
The right-hand side of [29] can be expressed
in terms of the vortices. First, using [26], we
have $\int_\Omega (h_0 - 1)\,\mu(\psi, A) \simeq 2\pi \sum_i d_i (h_0 - 1)(a_i)$. Second,
the expression $\frac{1}{2}\int_\Omega |\nabla h'|^2 + |h'|^2$ can be treated exactly
like $E_\varepsilon(u)$ in the previous section: using lower bounds for
the cost of vortices provided by the Jerrard-Sandier
method, we are led to the (approximate) relation

$$ \frac{1}{2}\int_\Omega |\nabla h'|^2 + |h'|^2 \simeq \pi \sum_i |d_i| \log\frac{1}{\varepsilon} - \pi \sum_{i \ne j} d_i d_j \log|a_i - a_j| + \pi \sum_{i,j} d_i d_j S(a_i, a_j) \qquad [30] $$

Combining this with [29], we find the decomposition

$$ G_\varepsilon(\psi, A) \ge h_{ex}^2 J_0 + \pi \sum_i |d_i|\,|\log\varepsilon| + 2\pi h_{ex} \sum_i d_i (h_0 - 1)(a_i) - \pi \sum_{i \ne j} d_i d_j \log|a_i - a_j| + \pi \sum_{i,j} d_i d_j S(a_i, a_j) \qquad [31] $$

On the other hand, this inequality is sharp: as
before, given vortices $a_i$, one can construct a
configuration $(\psi, A)$ for which this is an equality,
at leading order.

In that relation, $h_{ex}^2 J_0$ is a fixed energy, the energy
of the vortex-free configuration. To it are added the
intrinsic cost of each vortex $\pi |d_i|\,|\log\varepsilon|$, the interaction
cost between vortices, and the interaction
between the vortices and the external field
$2\pi h_{ex} \sum_i d_i (h_0 - 1)(a_i)$.

It is then simple, by minimizing the right-hand
side with respect to the vortices for a given $h_{ex}$, and
observing that $h_0 - 1 \le 0$, to deduce a few basic
facts about vortices: vortices of positive degree (and
of degree $+1$) are preferred, and each vortex costs
$\pi |\log\varepsilon|$ and allows one to gain at best an energy
$2\pi h_{ex} \max|h_0 - 1|$ when placed at the minimum of
$h_0 - 1$. Therefore, vortices become favorable when
their cost becomes smaller than the gain, that is,
when $h_{ex}$ becomes larger than the "first critical field"

$$ H_{c_1} = \frac{|\log\varepsilon|}{2\,|\min(h_0 - 1)|} \qquad [32] $$



We have the first main result. 


Theorem 2 (Sandier-Serfaty). When $\varepsilon$ is small
enough and $h_{ex} < H_{c_1}$, then minimizers of $G_\varepsilon$ have
no vortices.
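
To make formula [32] concrete, here is a sketch under the assumption that $\Omega$ is a disk of radius $R$ (a choice made only for this example). In that case the radial solution of [25] is $h_0(r) = I_0(r)/I_0(R)$, where $I_0$ is the modified Bessel function of order 0, and its minimum is attained at the center. The function names and the truncated power series for $I_0$ are illustrative.

```python
import math

def bessel_i0(x, terms=60):
    """Modified Bessel function I_0 via its power series sum_k (x^2/4)^k / (k!)^2."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x * x / 4.0) / ((k + 1) ** 2)
    return total

def h0_disk(r, R):
    """Radial solution of -Laplace(h0) + h0 = 0 in the disk of radius R with h0 = 1 on the boundary."""
    return bessel_i0(r) / bessel_i0(R)

def first_critical_field(eps, R):
    """Leading-order H_c1 from [32], with min(h0 - 1) attained at the center of the disk."""
    min_h0_minus_1 = h0_disk(0.0, R) - 1.0
    return abs(math.log(eps)) / (2.0 * abs(min_h0_minus_1))

print(first_critical_field(1e-4, 1.0))
```

Since $|\min(h_0 - 1)| < 1$, the value returned always exceeds $|\log\varepsilon|/2$; for small disks $h_0$ stays close to 1 and the predicted first critical field becomes large.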


On the other hand, if $h_{ex} > H_{c_1}$, the vortices
cannot all be located at the same minimum point of
$h_0 - 1$, because their repulsion $-\pi \sum_{i \ne j} \log|a_i - a_j|$
would be infinite. There is thus a trade-off between
their repulsion and the cost for being far from the
minimum of $h_0 - 1$. Only if $n$, the number of
vortices, is small compared to $h_{ex}$ do the vortices
tend to concentrate near the minimum of $h_0 - 1$. If
so, then, assuming for simplicity that the minimum
of $h_0 - 1$ is achieved at a unique point $p$, and
denoting by $Q$ the Hessian of $h_0 - 1$ at $p$, in the
relation [31] $(h_0 - 1)(a_i)$ can be approximated by
$\min(h_0 - 1) + \frac{1}{2} Q(a_i - p)$ and thus $G_\varepsilon(\psi, A)$ by

$$ G_\varepsilon(\psi, A) \simeq h_{ex}^2 J_0 + \pi n |\log\varepsilon| + 2\pi n h_{ex} \min(h_0 - 1) + \pi h_{ex} \sum_i Q(a_i - p) - \pi \sum_{i \ne j} d_i d_j \log|a_i - a_j| + \pi n^2 S(p, p) \qquad [33] $$

From this relation, optimizing on $\ell$, the characteristic
distance to $p$ and characteristic distance
between the vortices, we find that $\ell = \sqrt{n/h_{ex}}$ is
optimal.

Moreover, optimizing with respect to $n$, we find
that $n$ should remain bounded (as $\varepsilon \to 0$) when
$h_{ex} \le H_{c_1} + O(\log|\log\varepsilon|)$. In that regime, rescaling
by setting $x_i = (a_i - p)/\ell$, we have the following
result:
Theorem 3 (Sandier-Serfaty). There exist fields
$H_n \sim H_{c_1} + C(n-1)\log|\log\varepsilon|$ such that when
$H_n \le h_{ex} < H_{n+1}$, minimizers of $G_\varepsilon$ have $n$ vortices
of degree 1, and the rescaled vortices $x_i$ tend to
minimize

$$ w_n(x_1, \dots, x_n) = -\pi \sum_{i \ne j} \log|x_i - x_j| + \pi n \sum_{i=1}^n Q(x_i) \qquad [34] $$
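
The finite-dimensional problem [34] is easy to explore numerically. The sketch below assumes the hypothetical quadratic form $Q(x) = |x|^2$ (in the text $Q$ is the Hessian of $h_0 - 1$ at $p$, which depends on the domain) and uses plain gradient descent with a fixed step; the function names, step size, and starting points are illustrative choices.

```python
import math

def w_n(xs):
    """w_n from [34] with the assumed choice Q(x) = |x|^2; points are complex numbers."""
    n = len(xs)
    repulsion = -math.pi * sum(math.log(abs(xs[i] - xs[j]))
                               for i in range(n) for j in range(n) if i != j)
    confinement = math.pi * n * sum(abs(x) ** 2 for x in xs)
    return repulsion + confinement

def grad_w_n(xs):
    """Gradient of w_n, each component stored as a complex number (d/dRe + i d/dIm)."""
    n = len(xs)
    grads = []
    for i in range(n):
        rep = sum((xs[i] - xs[j]) / abs(xs[i] - xs[j]) ** 2
                  for j in range(n) if j != i)
        grads.append(-2.0 * math.pi * rep + 2.0 * math.pi * n * xs[i])
    return grads

def minimize_w_n(xs, step=0.01, iters=5000):
    """Fixed-step gradient descent; adequate for the small examples here."""
    xs = list(xs)
    for _ in range(iters):
        g = grad_w_n(xs)
        xs = [x - step * gx for x, gx in zip(xs, g)]
    return xs

start = [0.3 + 0.1j, -0.2 - 0.05j]
pair = minimize_w_n(start)
print(abs(pair[0] - pair[1]), abs(pair[0] + pair[1]))
```

For $n = 2$ and this choice of $Q$, placing the points at $\pm r$ gives $w_2 = -2\pi\log(2r) + 4\pi r^2$, which is minimized at $r = 1/2$, so the two rescaled vortices settle symmetrically about the origin at distance 1 from each other.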


If $h_{ex} - H_{c_1} \gg \log|\log\varepsilon|$, then the optimal number
of vortices $n$ becomes unbounded as $\varepsilon \to 0$. The
analysis above still holds, but in order to get a
convergence of the vortices, one needs to rescale the
vorticity measure by $n$. There is an intermediate
regime, $\log|\log\varepsilon| \ll h_{ex} - H_{c_1} \ll |\log\varepsilon|$, for
which $n \gg 1$ but still $n \ll h_{ex}$, so $\ell \ll 1$:
vortices are numerous, but still concentrate around $p$.
Rescaling by the scale $\ell$ as above, we prove that the
density of vortices (after dividing it by $n$) converges to
a probability measure, minimizer of the energy

$$ I(\mu) = -\pi \iint_{\mathbb{R}^2 \times \mathbb{R}^2} \log|x - y|\,d\mu(x)\,d\mu(y) + \pi \int_{\mathbb{R}^2} Q(x)\,d\mu(x) \qquad [35] $$

This is an averaged/continuous form of [34].

If $h_{ex} - H_{c_1}$ is of order $|\log\varepsilon|$, then the optimal
number $n$ becomes of order $h_{ex}$, and the vortices no
longer concentrate around a single point.

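The simplest approach is then to consider
the vorticity measure $\mu(\psi, A)$ and to rescale it by the
order of $n$, hence by $h_{ex}$. Then $(1/h_{ex})\,\mu(\psi, A)$ converges,
after extraction, to some measure $\mu_*$. A
continuous version of [31] can thus be written, using
[12], as

$$ G_\varepsilon(\psi, A) \ge \frac{1}{2} h_{ex} |\log\varepsilon| \int_\Omega |\mu_*| + \frac{h_{ex}^2}{2}\int_\Omega |\nabla h_{\mu_*}|^2 + |h_{\mu_*} - 1|^2 \qquad [36] $$

where $h_{\mu_*}$ solves

$$ \begin{cases} -\Delta h_{\mu_*} + h_{\mu_*} = \mu_* & \text{in } \Omega \\ h_{\mu_*} = 1 & \text{on } \partial\Omega \end{cases} $$

Again, this inequality can be proved to be sharp (by
a construction) and allows one to show that minimizers
of $G_\varepsilon$ have a vorticity $\mu(\psi, A)$ such that $\mu(\psi, A)/h_{ex}$
converges to a minimizer of

$$ G(\mu) = \frac{1}{2}\left(\lim_{\varepsilon \to 0} \frac{|\log\varepsilon|}{h_{ex}}\right)\int_\Omega |\mu| + \frac{1}{2}\int_\Omega |\nabla h_\mu|^2 + |h_\mu - 1|^2 \qquad [37] $$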

In fact, the following stronger result holds:


Theorem 4 (Sandier-Serfaty). $G_\varepsilon / h_{ex}^2$ $\Gamma$-converges
to $G$.


The limit problem of minimizing $G$ turns out to
have a simple solution in terms of an obstacle
problem: the optimal $\mu_*$ is a uniform density of
vortices on a subdomain of $\Omega$ determined through a
free boundary problem (and depending on $h_{ex}$),
which is nonzero. 

In all these regimes, we have thus been able to
identify the optimal number and distribution of
vortices through a $\Gamma$-convergence-type approach,
that is, by reducing the minimization of the energy
to the minimization of a limiting problem: $w_n$, $I$, or $G$,
according to the regime.


Further Results 


Concerning vortices, in the same spirit as what was
done for $E_\varepsilon$, we can obtain necessary conditions
characterizing limiting vorticities obtained from
sequences of (nonminimizing) critical points of
the energy $G_\varepsilon$. They consist in passing to the limit
in the conservative form of the Ginzburg-Landau
equations [5].

Most of the results concerning the phase transitions
at the next critical fields $H_{c_2}$ and $H_{c_3}$ are also
obtained by nonvariational methods, and often by
linear analysis.

The study of the Ginzburg-Landau energy in non- 
simply-connected domains is also very interesting 
because it leads to nontrivial topological effects, since 
in such domains there exist unit-valued maps with 
nonzero degree (corresponding to permanent currents). 


See also: Abelian Higgs Vortices; Aharonov-Bohm Effect; 
Bose-Einstein Condensates; Gamma-Convergence and 
Homogenization; Gauge Theory: Mathematical 
Applications; Ginzburg-Landau Equation; High $T_c$
Superconductor Theory; Image Processing:
Mathematics; Superfluids; Topological Defects and Their 
Homotopy Classification; Variational Techniques for 
Microstructures. 
Austenite-Martensite Transformations
and the Shape Memory Effect 


Microstructures in materials that typically form in 
response to phase transformations in the solid state, 
and their impact on the elastic properties of these 
materials have been known for centuries. The 
discovery of the complex phase diagram of iron 
revolutionized the production of steels at the end of 
the nineteenth century. Starting in the 1980s, the 
mathematical description of microstructures in the 
framework of nonlinear elasticity has led to deep 
analytical questions and surprising developments in 
the calculus of variations and in nonlinear partial 
differential equations. 

The mathematical approach outlined here is based 
on the following fundamental assumptions: 


1. The observed configurations correspond to mini- 
mizers of or elements of minimizing sequences 
for an energy functional. 

2. The qualitative properties of low energy states 
are determined from the set of minima of the free 
energy density. 


Under these assumptions one aims at explaining
experimental observations and predicting material
properties based on minimizing an energy functional
of the form

$$ I(u) = \int_\Omega W(Du)\,dx $$

Here $\Omega$ is an ideal, unstressed reference configuration
in $\mathbb{R}^n$, $u : \Omega \to \mathbb{R}^m$ is an elastic deformation, and
$W : \mathbb{M}^{m \times n} \to \mathbb{R}$ is the stored energy density. In the
case of physical interest, $m = n = 2$ or $m = n = 3$. For
applications in elasticity we assume that $m = n$, but
this assumption is not needed in the general theory.
The energy density W and its structure depend 
critically on the temperature. However, since we are 
interested in the analysis of the material at a given 
temperature, we do not include this dependence 
explicitly. 

The key ingredient of this model is the stored 
energy density W which has to reflect the properties 
of the specific material one wants to model. 
Frequently these are alloys, in particular shape 
memory alloys that undergo an austenite—martensite 
transformation. For most materials a closed analytic 


expression for W is not available. In the spirit of the 
fundamental assumption (2) one therefore focuses 
on the structure of the set of minima of W which is 
determined from general invariance and symmetry 
principles. We may assume that $W \ge 0$ and that
$K = \{X : W(X) = 0\} \ne \emptyset$. The principle of material
frame indifference then asserts that

$$ W(RF) = W(F) \quad \text{for all } R \in SO(n) $$

Here $SO(n)$ is the group of proper rotations, that is,
the set of all matrices $R \in \mathbb{M}^{n \times n}$ with $R^T R = \mathrm{Id}$ and
$\det R = 1$.

The symmetry of the austenitic (high-temperature)
phase implies that the energy density in the
martensitic (low-temperature) phase is invariant
under all changes of basis that leave the underlying
lattice in the austenitic phase invariant. Therefore,

$$ W(R^T F R) = W(F) \quad \text{for all } R \in \mathcal{P}_a $$

where $\mathcal{P}_a$ is the point group of the austenite. In the
case of a cubic to tetragonal phase transformation,
this leads to $K = SO(3)$ in the austenitic phase and to

$$ K = SO(3)U_1 \cup SO(3)U_2 \cup SO(3)U_3 \qquad [1] $$


with

$$ U_i = \eta_2\, e_i \otimes e_i + \eta_1 (I - e_i \otimes e_i) \qquad [2] $$

in the martensitic phase (see Figure 1). A set of the
form $SO(n)U_i$ is often referred to as an energy well.

The origin of the shape memory effect is the 
availability of a rich class of geometric patterns in 
which the martensitic phases can be arranged, thus 
leading to a great flexibility of the material to 
accommodate macroscopic deformations. Upon heat- 
ing of the material above the transformation tem- 
perature, the martensitic phases lose their stability 
and the material returns to its unique shape in the
austenitic phase.

Figure 1 Two-dimensional cartoon of a cubic to tetragonal
phase transformation in a single crystal: (a) a cubic lattice, (b)
and (c) tetragonal variants which are stretched in directions $e_1$
and $e_2$, respectively. (Sketch not to scale.)

Figure 2 Formation of phase boundaries in a single crystal.
(a) The upper right half of the lattice deforms into phase I with
the constant deformation gradient $U_1$, the lower left half of the
lattice deforms into phase II with constant deformation gradient
$U_2$. (b) An additional rotation is needed to accomplish a
continuous deformation, see formula [3]. (c) A different configuration
with a different orientation of the interface. (Sketch not
to scale.)

The two solutions of Hadamard's
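compatibility condition

$$ Q U_2 - U_1 = a \otimes b, \qquad Q \in SO(3) $$

are given by

$$ Q_1 = \frac{1}{\eta_1^2 + \eta_2^2} \begin{pmatrix} 2\eta_1\eta_2 & \eta_1^2 - \eta_2^2 & 0 \\ \eta_2^2 - \eta_1^2 & 2\eta_1\eta_2 & 0 \\ 0 & 0 & \eta_1^2 + \eta_2^2 \end{pmatrix} \qquad [3] $$

and $Q_2 = Q_1^T$ (see Figure 2). The normals (in the
reference configuration) are given by $(1, \pm 1, 0)/\sqrt{2}$.
It is one of the successes of the theory that it
provides an analytical derivation of the normals to
the twinning planes.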
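
Hadamard's compatibility condition can be verified numerically for a concrete pair of tetragonal variants. The sketch below makes several assumptions that are not fixed by the text: the variants are taken diagonal, with stretch `eta2` along $e_i$ and `eta1` in the transverse directions, the numerical values of `eta1` and `eta2` are arbitrary, and the rotation is a rotation about $e_3$ with $\cos\theta = 2\eta_1\eta_2/(\eta_1^2 + \eta_2^2)$. All helper names are illustrative.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def rank_one_defect(Q, U1, U2, b):
    """Largest entry of |(Q U2 - U1) - a (x) b| with a = (Q U2 - U1) b, for a unit vector b."""
    QU2 = matmul(Q, U2)
    M = [[QU2[i][j] - U1[i][j] for j in range(3)] for i in range(3)]
    a = [sum(M[i][k] * b[k] for k in range(3)) for i in range(3)]
    return max(abs(M[i][j] - a[i] * b[j]) for i in range(3) for j in range(3))

eta1, eta2 = 0.95, 1.10  # hypothetical transformation stretches
U1 = [[eta2, 0.0, 0.0], [0.0, eta1, 0.0], [0.0, 0.0, eta1]]
U2 = [[eta1, 0.0, 0.0], [0.0, eta2, 0.0], [0.0, 0.0, eta1]]

D = eta1 ** 2 + eta2 ** 2
c, s = 2.0 * eta1 * eta2 / D, (eta2 ** 2 - eta1 ** 2) / D
Q1 = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]  # rotation about e_3
Q2 = transpose(Q1)

r = 1.0 / math.sqrt(2.0)
print(rank_one_defect(Q1, U1, U2, [r, r, 0.0]))
print(rank_one_defect(Q2, U1, U2, [r, -r, 0.0]))
```

Both defects vanish up to rounding, so under these assumptions each rotation yields a rank-one connection whose normal is one of $(1, \pm 1, 0)/\sqrt{2}$; pairing a rotation with the other normal leaves a defect of order $|\eta_2 - \eta_1|$.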


The Direct Method in the Calculus 
of Variations 


The mathematical interest in the variational prob- 
lems described in the previous section lies in the fact 
that existence of minimizers cannot in general be 
obtained by a straightforward application of the 
direct methods in the calculus of variations. This 
approach is based on the idea to (1) choose a 
minimizing sequence for the functional I, (2) show 
that this sequence is bounded and precompact, 
and (3) prove that the functional is lower semicon- 
tinuous with respect to the notion of convergence, 


$$ I(u) \le \liminf_{j \to \infty} I(u_j) \quad \text{if } u_j \to u $$


The typical choice is to seek $u_j$ in a suitable Sobolev
space $W^{1,p}(\Omega; \mathbb{R}^m)$ with $1 < p \le \infty$ which is related
to growth and coercivity conditions for the energy
density $W$,

$$ c_1 |F|^p - c_2 \le W(F) \le c_3 (|F|^p + 1) \quad \text{for all } F \in \mathbb{M}^{m \times n} \qquad [4] $$


This leads to weak compactness in $W^{1,p}(\Omega; \mathbb{R}^m)$
(weak-* compactness in $W^{1,\infty}(\Omega; \mathbb{R}^m)$) and to the
requirement of sequential weak lower semicontinuity
of the functional,

$$ I(u) \le \liminf_{j \to \infty} I(u_j) \quad \text{if } u_j \rightharpoonup u \text{ in } W^{1,p}(\Omega; \mathbb{R}^m) $$

(sequential weak-* lower semicontinuity for $p = \infty$).
Morrey’s fundamental work establishes a link 
between convexity conditions for the energy density 
and lower semicontinuity of the variational integral: 
under suitable growth and coercivity conditions, 
sequential weak-* lower semicontinuity is equivalent 
to quasiconvexity of the integrand. 
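
A one-dimensional toy computation shows why a convexity-type condition is needed for lower semicontinuity. It is only an illustration under assumptions that are not in the text: the scalar double-well density $W(F) = (F^2 - 1)^2$ on the interval $(0, 1)$ and sawtooth functions $u_j$ with slopes $\pm 1$ and period $1/j$. The functions `W`, `zigzag`, and `energy` are hypothetical helpers.

```python
def W(F):
    """Assumed scalar double-well density with zero set {-1, +1}."""
    return (F * F - 1.0) ** 2

def zigzag(x, j):
    """Sawtooth with slopes +1 / -1, period 1/j and amplitude 1/(2j)."""
    t = (x * j) % 1.0
    return (t if t < 0.5 else 1.0 - t) / j

def energy(u, cells=256):
    """Integral of W(u') over (0, 1) for a function that is affine on each grid cell."""
    h = 1.0 / cells
    return sum(W((u((k + 1) * h) - u(k * h)) / h) * h for k in range(cells))

for j in (1, 2, 4, 8):
    amplitude = max(abs(zigzag(k / 256.0, j)) for k in range(257))
    print(j, amplitude, energy(lambda x, j=j: zigzag(x, j)))

print(energy(lambda x: 0.0))  # energy of the uniform limit u = 0
```

The sawtooth functions converge uniformly to $u = 0$ while their energies stay at zero, whereas the limit has energy $W(0) = 1$: the functional is not lower semicontinuous along this sequence, which is the scalar analog of the failure of quasiconvexity discussed in this section.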


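Definition 1 A function $W : \mathbb{M}^{m \times n} \to \mathbb{R}$ is said to be
quasiconvex at $F$ if

$$ \int_\Omega W(F)\,dx \le \int_\Omega W(F + D\phi)\,dx \quad \text{for all } \phi \in W_0^{1,\infty}(\Omega; \mathbb{R}^m) $$

and for all open and bounded domains $\Omega \subset \mathbb{R}^n$ with
$\mathcal{L}^n(\partial\Omega) = 0$. It is said to be quasiconvex if it is
quasiconvex at all $F$.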


In the language of nonlinear elasticity, W is 
quasiconvex if affine functions are minimizers of 
the energy functional subject to their own boundary 
conditions. The direct method implies the following 
classical existence theorem. 


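Theorem 1 Suppose that $W : \mathbb{M}^{m \times n} \to \mathbb{R}$ is quasiconvex
and satisfies the growth and coercivity condition [4].
Let $u_0 \in W^{1,p}(\Omega; \mathbb{R}^m)$. Then the variational
problem: minimize $I(u)$ in

    $\mathcal{A} = \{u \in W^{1,p}(\Omega; \mathbb{R}^m) : u - u_0 \in W_0^{1,p}(\Omega; \mathbb{R}^m)\}$

has a minimizer.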


The remarkable fact is that the structure of the
zero set of a typical energy $W$ modeling a phase-transforming
material in its low-temperature phase
prevents $W$ from being quasiconvex. In order to see
this, let $\Omega \subset \mathbb{R}^3$ be a cube with two of its sides
perpendicular to $b = (1,1,0)/\sqrt{2}$ and let $h$ be the
1-periodic function with $h' = 0$ on $(0, \lambda)$ and $h' = 1$ on
$(\lambda, 1)$ with $\lambda \in (0, 1)$. Define $v_j(x) = U_1 x + a\,h(j x \cdot b)/j$
and

    $u_j(x) = \min\{v_j(x), \mathrm{dist}(x, \partial\Omega)\} = \min\{U_1 x + a\,h(j x \cdot b)/j, \mathrm{dist}(x, \partial\Omega)\}$

where $\mathrm{dist}(x, \partial\Omega) = \inf\{\|x - y\|_\infty : y \in \partial\Omega\}$. Then
$u_j \to u$, $u(x) = Cx$, strongly in $L^\infty(\Omega; \mathbb{R}^3)$ and weakly-*
in $W^{1,\infty}(\Omega; \mathbb{R}^3)$ with $C = \lambda U_1 + (1 - \lambda) Q_1 U_2 \notin K$,
where $K$ is the zero set of $W$, see the previous section.


Figure 3 Construction of a minimizing sequence $u_j$ with $Du_j \to \{A, B\}$
in measure and affine boundary conditions $u(x) = (\lambda A + (1 - \lambda) B)x$.
Hadamard's compatibility condition requires that $A - B = a \otimes b$
is a rank-1 matrix and that the planar interfaces are
perpendicular to $b$.


Moreover, $Du_j \in \{U_1, Q_1 U_2\}$ except in a small transition
layer of volume $O(1/j)$ close to $\partial\Omega$ and

    $I(u) = \int_\Omega W(C)\,dx > \liminf_{j \to \infty} I(u_j) = 0$

This inequality shows that the functional is not
weakly-* lower semicontinuous and therefore $W$
fails to be quasiconvex. The oscillations of $u_j$ on a
scale $1/j$ are part of the mathematical model for the
microstructures frequently observed in shape memory
alloys. More generally, whenever $u$ is a Sobolev
function on a domain $\Omega$ such that $Du$ takes only two
values, say $Du \in \{A, B\}$, on open sets which are not
empty and whose union is $\Omega$ (up to a set of measure
zero), then the tangential continuity of the derivatives
implies that the difference $A - B$ is a matrix of
rank 1, $A - B = a \otimes b$, and that the interfaces
between the regions with $Du = A$ and $Du = B$ are
hyperplanes with normal parallel to $b$. This statement
is usually referred to as "Hadamard's compatibility
condition." Moreover, the pattern in Figure 3
is known as a "simple laminate" and the matrices $A$
and $B$ are said to be rank-1 connected.


Relaxation 


The discussion in the previous section shows that the 
variational problems related to models in materials 
science typically fail to be weakly lower semicon- 
tinuous. One approach which allows us to recover 
the macroscopic energy of the system and the macro- 
scopic stress-strain relation is to pass to the relaxed 
variational problem which involves the quasiconvex 
envelope of the energy density W. 


Definition 2 Let $W : \mathbb{M}^{m \times n} \to \mathbb{R}$ be given. The
function

    $W^{qc} = \sup\{f : f \le W,\ f \text{ quasiconvex}\}$

is called the quasiconvex envelope of $W$. Equivalently,

    $W^{qc}(F) = \inf_{\varphi \in W_0^{1,\infty}(\Omega; \mathbb{R}^m)} \frac{1}{|\Omega|} \int_\Omega W(F + D\varphi)\,dx$


This formula implies that $W^{qc}$ is the macroscopic
energy of the system in the sense that it characterizes
the smallest energy per unit volume that is required
to subject a volume element to a deformation with
affine boundary conditions. Here the system is
allowed to minimize its energy with microstructures
at any scale, a mechanism which was already
explored in the previous section. The arguments given
there prove that $W^{qc}(C) = 0$ and this shows
that the zero set of $W^{qc}$ can be strictly larger than
the zero set $K$ of $W$, see Definition 4. The relaxed
functional is given by

    $I^{qc}(u) = \int_\Omega W^{qc}(Du)\,dx$

Since $W^{qc}$ satisfies the growth and coercivity
conditions [4] if they are satisfied by $W$, the
functional $I^{qc}$ attains its minimum subject to given
boundary conditions. The functional $I^{qc}$ is the
weakly lower semicontinuous envelope of $I$ in the
sense that minimizing sequences for $I$ contain
subsequences that converge to minimizers of $I^{qc}$
and for all $u$ there exists a sequence $u_j$ which
converges weakly in $W^{1,p}(\Omega; \mathbb{R}^m)$ to $u$ such that the
energies converge, $I(u_j) \to I^{qc}(u)$. However, a lot of
information, in particular about oscillation patterns,
might be lost in the passage from $I$ to $I^{qc}$ since the
knowledge of a minimizer $u$ for $I^{qc}$ does not
provide any immediate information about the
behavior of any minimizing sequence for $I$ that
converges to $u$. Moreover, the minimization problem
required in the definition of the relaxed
energy has been solved explicitly only for very
special energy densities.

In this context, one often relies on two related
notions of convexity, one sufficient and the other
necessary for quasiconvexity. For $F \in \mathbb{M}^{m \times n}$ let
$M(F) \in \mathbb{R}^{d}$ be the vector of all minors
(subdeterminants) of $F$, where $d$ is the number of minors.
In the special case $m = n = 2$
we have $M(F) = (F, \det F) \in \mathbb{R}^5$ and for $m = n = 3$
we find $M(F) = (F, \mathrm{cof}\,F, \det F) \in \mathbb{R}^{19}$ where $\mathrm{cof}\,F$
is the $3 \times 3$ matrix of all $2 \times 2$ subdeterminants
of $F$.


Definition 3 Let $W : \mathbb{M}^{m \times n} \to \mathbb{R}$ be given. The
function $W$ is said to be polyconvex if there exists
a convex function $g : \mathbb{R}^{d} \to \mathbb{R}$, with $d$ the number of
minors of $F$, such that $W(F) = g(M(F))$. The function $W$ is rank-1 convex if
it is convex along all rank-1 lines in $\mathbb{M}^{m \times n}$, that is, the
function $t \mapsto W(F + tR)$ is convex for all $F \in \mathbb{M}^{m \times n}$
and all $R \in \mathbb{M}^{m \times n}$ with $\mathrm{rank}(R) = 1$.

All notions of convexity reduce to classical
convexity if $m = 1$ or $n = 1$. In the vector-valued
case $m, n > 1$ the following implications are true:

    $f$ convex $\Rightarrow$ $f$ polyconvex $\Rightarrow$ $f$ quasiconvex $\Rightarrow$ $f$ rank-1 convex

The reverse statements for the first two implications
are not true. Rank-1 convexity does not imply
quasiconvexity for $m \ge 3$ and it is a fundamental
open problem with deep connections to harmonic
analysis to decide whether rank-1 convexity and
quasiconvexity are equivalent for $m = n = 2$.
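
As a small illustration of why the determinant is rank-1 convex (indeed polyconvex) but not convex, the following sketch evaluates second differences of $t \mapsto \det(F + tR)$ for $2 \times 2$ matrices. The matrices `F`, `R1` and `R2` and all helper names are illustrative choices, not taken from the article.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    return a * d - b * c


def outer(a, b):
    """Rank-1 matrix a (x) b with entries a_i * b_j."""
    return [[a[i] * b[j] for j in range(2)] for i in range(2)]


def shifted(f, t, r):
    """Return F + t R."""
    return [[f[i][j] + t * r[i][j] for j in range(2)] for i in range(2)]


def second_difference(f, r, h=Fraction(1)):
    """g(h) - 2 g(0) + g(-h) for g(t) = det(F + t R); equals 2 h^2 det(R)."""
    g = lambda t: det2(shifted(f, t, r))
    return g(h) - 2 * g(Fraction(0)) + g(-h)


# Illustrative data (assumptions made for this sketch only).
F = [[Fraction(2), Fraction(-1)], [Fraction(3), Fraction(5)]]
R1 = outer([Fraction(1), Fraction(2)], [Fraction(3), Fraction(-1)])  # rank 1
R2 = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(-1)]]       # rank 2

along_rank_one = second_difference(F, R1)  # det is affine along rank-1 lines
along_rank_two = second_difference(F, R2)  # negative: det is not convex
```

Since $\det(F + tR) = \det F + t\,(\mathrm{cof}\,F : R) + t^2 \det R$ for $2 \times 2$ matrices, the second difference vanishes exactly when $\det R = 0$, which is the rank-1 case.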

The polyconvex and the rank-1 convex envelope
of an energy density $W$ are defined analogously to
Definition 2. In view of the implications between the
different notions of convexity, one has
$W^{pc} \le W^{qc} \le W^{rc}$ and essentially all explicitly known
relaxation formulas are based on the approach to
construct a candidate $W^*$ for $W^{rc}$ and to verify that
$W^*$ is polyconvex. Then the inequalities become
equalities and one obtains a characterization for the
relaxed energy. This approach does not work for
extended-valued functions which are used in models
for incompressible materials since quasiconvexity
does not imply rank-1 convexity in this case.
However, for a model system of particular interest,
nematic elastomers, a complete characterization of
the relaxed energy, the macroscopic stress-strain
relation, and the macroscopic phase diagram has
been obtained.


Classical and Generalized Minimizers 


The discussion of observed configurations as elements
of minimizing sequences $(u_j)$ in the section
"The direct method in the calculus of variations"
leaves the question of the existence of minimizers
open. The answer cannot be obtained via the direct
methods since minimizing sequences do not need to
converge strongly to minimizers. One approach to
obtain the existence of solutions $u$ with $I(u) = 0$ is to
solve the differential relation $Du \in K$, $u(x) = Fx$ on
$\partial\Omega$ by constructing special minimizing sequences
that converge strongly so that one can pass to
the limit in the energy integral. This idea has led
to surprising solutions $u$ with affine boundary
conditions for the two-well problem where
$K = SO(2)\,\mathrm{diag}(\eta, 1/\eta) \cup SO(2)\,\mathrm{diag}(1/\eta, \eta)$. However, the
structure of the solutions is intrinsically complicated 
in the sense that the phase boundary has infinite 
length unless the boundary conditions are given by 
$u(x) = Fx$ with $F \in K$.

More generally, the right tool to pass to the limit in
nonlinear functions of $z_j = Du_j$ like the energy is the
"Young measure" generated by a subsequence. It is
given by a family of probability measures $\nu_x$ that
provide statistical information about the distribution of
the values of $z_j$ close to a given point $x$. The existence
and the fundamental properties of Young measures are
described in the following theorem. For simplicity we
assume that the sequence $z_j$ is uniformly bounded.


Theorem 2 (Fundamental theorem on Young
measures). Let $E \subset \mathbb{R}^n$ be measurable, $\mathcal{L}^n(E) < \infty$,
and let $z_j : E \to \mathbb{R}^d$ be a measurable and bounded
sequence. Then there exists a subsequence $z_k$ and a
weakly-* measurable map $\nu : E \to \mathcal{M}(\mathbb{R}^d)$ such that
the following assertions are true:

(i) The measures $\nu_x$ are non-negative probability
measures.
(ii) If there exists a compact set $K$ such that $z_k \to K$
in measure, then $\mathrm{supp}\,\nu_x \subset K$ for a.e. $x \in E$.
(iii) If $f \in C(\mathbb{R}^d)$ and if $f(z_k)$ is relatively weakly
compact in $L^1(E)$, then $f(z_k) \rightharpoonup \bar{f}$ in $L^1(E)$
where $\bar{f}(x) = \langle \nu_x, f \rangle$.


Here $\langle \nu_x, f \rangle$ denotes the integration of the function
$f$ with respect to the measure $\nu_x$. For example,
the sequence $Du_j$
constructed in the section "The direct method in the
calculus of variations" generates the Young measure
$\nu_x = (1/2)\delta_A + (1/2)\delta_B$ (see Figure 3) and

    $I(u_j) = \int_\Omega W(Du_j)\,dx \to \int_\Omega \int_{\mathbb{M}^{m \times n}} W(Y)\,d\nu_x(Y)\,dx = 0$


A Young measure generated by a sequence of
gradients is called a gradient Young measure
(GYM). It is said to be homogeneous if $\nu_x = \nu$ is
independent of $x$. We restrict our attention in the
following to homogeneous GYMs generated by
sequences that are bounded in $L^\infty$. The importance
of quasiconvexity is also reflected in the following
characterization of homogeneous GYMs.


Theorem 3 A non-negative probability measure $\nu$
is a GYM if and only if there exists a compact set
$K \subset \mathbb{M}^{m \times n}$ with $\mathrm{supp}\,\nu \subset K$ and Jensen's inequality
$\langle \nu, f \rangle \ge f(\langle \nu, \mathrm{id} \rangle)$ holds for all quasiconvex functions
$f : \mathbb{M}^{m \times n} \to \mathbb{R}$.


This motivates characterizing the generalized
limits of minimizing sequences as

    $\mathcal{M}^{qc}(K) = \{\nu \in \mathcal{M}(K) : f(\langle \nu, \mathrm{id} \rangle) \le \langle \nu, f \rangle$  for all $f : \mathbb{M}^{m \times n} \to \mathbb{R}$ quasiconvex$\}$

where $\mathcal{M}(K)$ is the set of all probability measures
supported on $K$. If $\nu$ is generated by a sequence of
functions with affine boundary conditions
$u_j(x) = Fx$, then $\langle \nu, \mathrm{id} \rangle = F$. The set of all affine
deformations of the material that can be recovered
by heating (shape memory effect) is therefore given
as the set of all centers of mass of homogeneous
GYMs supported on $K$, the so-called "quasiconvex
hull" $K^{qc}$ of $K$.


Definition 4 Suppose that $K \subset \mathbb{M}^{m \times n}$ is compact.
We define the quasiconvex hull of $K$ by

    $K^{qc} = \{F = \langle \nu, \mathrm{id} \rangle : \nu \in \mathcal{M}^{qc}(K)\}$

There are several equivalent definitions of $K^{qc}$.
The foregoing definition corresponds to the definition
of the convex hull of a set as the set of all
centers of mass of probability measures supported
on $K$ (which satisfy Jensen's inequality for all
convex $f$). The set $K^{qc}$ can also be defined as the
set of all points that cannot be separated by
quasiconvex functions from $K$ or as the zero set of
the quasiconvex envelope of the distance function to
$K$. The "polyconvex hull" $K^{pc}$ and the "rank-1
convex hull" $K^{rc}$ are defined analogously by replacing
quasiconvexity with polyconvexity and rank-1
convexity in the foregoing definitions. It follows that
$K^{rc} \subset K^{qc} \subset K^{pc}$ and all of these inclusions can be
strict.

A particularly useful set of conditions is given by the
minors conditions

    $\langle \nu, M \rangle = M(\langle \nu, \mathrm{id} \rangle)$

for all minors $M$ which follow from the weak
continuity of the minors. For example, if
$K = \{A, B\} \subset \mathbb{M}^{2 \times 2}$, then any probability measure
supported on $K$ is given by $\nu = \lambda\delta_A + (1 - \lambda)\delta_B$. The
minors condition with $M = \det$ implies that

    $\det(\lambda A + (1 - \lambda)B) = \det\langle \nu, \mathrm{id} \rangle = \langle \nu, \det \rangle = \lambda \det A + (1 - \lambda) \det B$

This identity is equivalent to

    $\lambda(1 - \lambda)\det(A - B) = 0$

and therefore the quasiconvex hull is equal to $K$ if
and only if $\det(A - B) \ne 0$. A very instructive
example is the set $K = \{(1,3), (-1,-3), (-3,1), (3,-1)\}$
viewed as a subset of the space of all
diagonal matrices in $\mathbb{M}^{2 \times 2}$. It is frequently referred
to as a T4 configuration. The rank-1 convex hull is
equal to the quasiconvex hull and given by the four
points, the line segments, and the square in the
center, the polyconvex hull is bounded by four
hyperbolic arcs, and the convex hull is the square
with the points as corners, see Figure 4. It is
remarkable that the rank-1 convex hull is strictly
larger than the set $K$ itself despite the fact that the
set $K$ does not contain any rank-1 connections.

Figure 4 The four-point subset $K$ in the space of all diagonal
matrices and its convex hulls: $K^{rc} = K^{qc}$ are given by $K$, the line
segments and the shaded square, $K^{pc}$ is bounded by the dashed
hyperbolic arcs, and the convex hull is the outer square.
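
The two claims above can be checked by direct computation. The sketch below verifies, for one arbitrarily chosen pair of $2 \times 2$ matrices, that the gap in the minors condition equals $\lambda(1-\lambda)\det(A - B)$, and confirms that no two points of the T4 configuration are rank-1 connected. The sample matrices, the value of `lam` and the helper names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    return a * d - b * c


def combine(lam, A, B):
    """Return lam * A + (1 - lam) * B entrywise."""
    return [[lam * A[i][j] + (1 - lam) * B[i][j] for j in range(2)]
            for i in range(2)]


def difference(A, B):
    """Return A - B entrywise."""
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]


def minors_gap(lam, A, B):
    """<nu, det> - det(<nu, id>) for nu = lam * delta_A + (1 - lam) * delta_B."""
    return lam * det2(A) + (1 - lam) * det2(B) - det2(combine(lam, A, B))


# Arbitrary sample data (not from the article).
A = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]
B = [[Fraction(0), Fraction(1)], [Fraction(1), Fraction(5)]]
lam = Fraction(1, 3)
gap = minors_gap(lam, A, B)
predicted = lam * (1 - lam) * det2(difference(A, B))

# T4 configuration: diagonal matrices diag(x, y) listed as pairs (x, y).
T4 = [(1, 3), (-1, -3), (-3, 1), (3, -1)]


def rank_one_connected(p, q):
    """diag(p) - diag(q) has rank 1 iff exactly one diagonal entry differs."""
    return sum(1 for x, y in zip(p, q) if x != y) == 1


t4_connections = [(p, q) for p, q in combinations(T4, 2)
                  if rank_one_connected(p, q)]
```

Here `gap` and `predicted` agree, and `t4_connections` is empty, in line with the statement that the T4 set contains no rank-1 connections even though its rank-1 convex hull is strictly larger than the set.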

There are only a few examples in which explicit
characterizations of the convex hulls for sets
invariant under $SO(n)$ have been obtained. For
$K = SO(3)U_1 \cup SO(3)U_2$ (see [2]), one finds


a c 0 
K*—-JlFEeM??:FrF-|Íc b 0 
0 0 1/1 


~~ 


1 
ab — c = 1’, a+b+2le| <a +5 


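The quasiconvex hull of the three-well problem [1]
is not known. In two dimensions one finds for

    $K = SO(2)U_1 \cup \dots \cup SO(2)U_k$,  $\det U_i = 1$,  $i = 1, \dots, k$,

that

    $K^{qc} = \{F \in \mathbb{M}^{2 \times 2} : \det F = 1,\ |Fe|^2 \le \max_{i = 1, \dots, k} |U_i e|^2 \text{ for all } e \in S^1\}$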


All examples in which envelopes of functions or hulls 
of sets have been obtained explicitly are based on the 
exceptional property that the polyconvex envelope 
coincides with the rank-1 convex envelope. The T4 
configuration in Figure 4 is one of the few cases where 
the quasiconvex hull is known to be different from the 
polyconvex hull. The construction of quasiconvex 
functions and the understanding of their properties is 
one of the challenges left for the future.

of convexity and their properties can be found in
Dacorogna (1989). Šverák proved that rank-1
convexity does not imply quasiconvexity for $m \ge 3$
and Milton modified his example to show that the
mov's concept of convex integration, by Dacorogna
games. The structure of solutions of the two-well
problem with finite surface energy was analyzed by
Dolzmann and Müller. Young measures (also called
optimal control problems which do not admit
classical solutions (Young 1969). Tartar (1979)
introduced Young measures as a fundamental tool
differential equations and for the passage from
several authors including Scheffer, Aumann and
Hart, Casadio Tarabusi, Tartar, and Milton and


Introduction


The Navier-Stokes equations

    $\rho(u_t + u \cdot \nabla u) = -\nabla p + \mu \Delta u + f$   [1]
    $\nabla \cdot u = 0$   [2]


provide the simplest model for the motion of a 
viscous incompressible fluid that is consistent with 
the principles of mass and momentum conservation, 
and with Stokes’ hypothesis that the internal forces 
due to viscosity must be invariant with respect to 
any superimposed rigid motion of the reference 
frame. Despite their simplicity, they seem to govern 
the motion of air, water, and many other fluids very 
accurately over a wide range of conditions. Thus, 
their mathematical theory is central to the rigorous 
analysis of many experimental observations, from 
the asymptotics of steady wakes and jets, to the 
dynamics of convection cells, vortex shedding, and 
turbulence. During the last 80 years, a great deal of 
progress has been made on both the basic mathe- 
matical theory of the equations and on its applica- 
tion to the understanding of such phenomena. But 
one of the most important matters, that of estimat- 
ing the regularity of solutions over long periods of 
time, remains a vexing and fascinating challenge. 
Such an estimate will almost certainly be needed to 
prove the “global” existence of smooth solutions. By 
that we mean the existence of smooth solutions of 
the initial-value problem over indefinitely long 
periods of time without any restriction on the 
“size” of the data. To date we can prove the 
“local” existence of smooth solutions, but there 
remains a concern that if the data are large, 
solutions may develop singularities within a finite 
period of time. In fact, there is a great deal more at 
issue than this question of existence. A regularity 
estimate is required to prove the reliability of the 
equations as a predictive model. That is because any 
estimate for the continuous dependence of solutions 
on the prescribed data for a problem depends upon 
a regularity estimate, as do error estimates for 
numerical approximations. A global estimate for 
the regularity of solutions is also required for a 
mathematically rigorous theory of turbulence. In 
fact, it may be hoped that the insight which 
ultimately yields a global regularity estimate will 


also be pivotal to our understanding of turbulence,
perhaps justifying Kolmogorov theory; see Heywood
(2003). In this article we aim to present a relatively 
simple approach to the local existence, uniqueness, 
and regularity theory for the initial boundary value 
problem for the Navier-Stokes equations, and to 
discuss some observations that bear on the question 
of global regularity. A wider-ranging review of open 
problems is given in Heywood (1990), and further 
observations concerning the problem of global 
regularity are given in Heywood (1994). 


Setting the Problem 


To focus on core issues, we shall make some
simplifying assumptions. The fluid under consideration
will be assumed to completely fill (without free
boundaries or vacuums) a bounded, connected,
time-independent domain $\Omega \subset \mathbb{R}^n$, $n = 2$ or $3$, with
smooth boundary $\partial\Omega$. We are mainly interested in
the three-dimensional case, but comparisons with
the two-dimensional case are illuminating. The $\mathbb{R}^n$-valued
velocity $u(x,t) = (u_1(x,t), \dots, u_n(x,t))$ and $\mathbb{R}$-valued
pressure $p(x,t)$ are functions of the position
$x = (x_1, \dots, x_n) \in \Omega$ and time $t \ge 0$. Equation [1] is
an expression of Newton's second law of motion,
equating mass density times acceleration on the left
with several force densities on the right, due to
pressure and viscosity, and sometimes a prescribed
external force $f$. Written in full, using the summation
convention over repeated indices, its $i$th
component is

    $\rho \left( \frac{\partial u_i}{\partial t} + u_j \frac{\partial u_i}{\partial x_j} \right) = -\frac{\partial p}{\partial x_i} + \mu \frac{\partial^2 u_i}{\partial x_j \partial x_j} + f_i$

We will assume the density $\rho$ and the coefficient of
viscosity $\mu$ are positive constants.

In this article, we consider the initial boundary
value problem consisting of the equations [1], [2]
together with the initial and boundary conditions

    $u|_{t=0} = u_0$,  $u|_{\partial\Omega} = 0$   [3]

The initial velocity $u_0(x)$ is prescribed. It will be
assumed to possess whatever smoothness is convenient,
and to satisfy $\nabla \cdot u_0 = 0$ and $u_0|_{\partial\Omega} = 0$. The
boundary condition is a reasonable one, since fluids
adhere to rigid surfaces.

Notice that a further condition would be needed 
to uniquely determine the pressure, since only its 
derivatives appear in the problem as posed. We 
prefer to do without auxiliary conditions for the 
pressure, and to refer to u by itself as a solution of 


370 Viscous Incompressible Fluids: Mathematical Theory 


the problem provided there exists a scalar function $p$
which together with $u$ satisfies [1]-[3]. The problem
is said to be uniquely solvable if there is a unique
solution $u$, in which case the gradient of the pressure
is also uniquely determined, along with the pressure
up to a constant. Notice also that under our
assumptions a potential force like gravity has no
effect on $u$. If $u$ solves the problem in the absence of
such a force, then the inclusion of the force affects
only the pressure, from which the potential must be
subtracted. It turns out that the inclusion of a
prescribed nonpotential force, while complicating
many of the estimates below, does not affect in any
essential way those parts of the theory to be
presented here. Thus, for simplicity, we shall
henceforth assume that $f = 0$.


Reynolds Number 


We can make a slight further simplification of eqn [1]
by rescaling, with the objective of setting $\rho = 1$, or even
$\rho = 1$ and $\mu = 1$. This scaling is not required for the
existence theory we are presenting, but provides an
important insight for the study of stability, bifurcation,
and turbulence. The Reynolds number

    $R = \frac{\max|u| \cdot |\Omega| \cdot \rho}{\mu}$

plays an important role in rescaling. It expresses the
ratio of the inertial to viscous effects. The notation
$|\Omega|$ represents a characteristic length, such as the
minimum diameter of a bounded domain. Generally
speaking, a high Reynolds number corresponds to
what is meant by "large" data, and the higher the
Reynolds number the more inclined a flow is to
instability and turbulence, and perhaps to the
development of singularities. However, the size of
the Reynolds number has precise implications only
in comparing "dynamically similar" flows. We say
that two vector fields $v(x,t)$ and $u(x,t)$ are dynamically
similar if and only if $v(x,t) = \alpha u(x/\beta, t/\gamma)$ for
some $\alpha, \beta, \gamma > 0$. In such a case, if $u$ is defined in
$\Omega \times [0, T)$, then $v$ will be defined in $\beta\Omega \times [0, \gamma T)$,
where $\beta\Omega = \{\beta x : x \in \Omega\}$. Furthermore, if $u$ satisfies
the Navier-Stokes equations, then $v$ will satisfy

    $\rho \alpha^{-1} \gamma\, v_t + \rho \alpha^{-2} \beta\, v \cdot \nabla v = -\beta \nabla p(x/\beta, t/\gamma) + \alpha^{-1} \beta^2 \mu\, \Delta v$   [4]

which has the form of the Navier-Stokes equations if
and only if the coefficients of the two inertial terms
on the left-hand side are equal. That is, if and only if

    $\alpha\gamma = \beta$   [5]

in which case

    $v_t + v \cdot \nabla v = -\nabla q + \eta \Delta v$   [6]

with

    $\eta = \alpha\beta\mu/\rho$   [7]

and $q(x,t) = \alpha^2 \rho^{-1} p(x/\beta, t/\gamma)$. We refer to such $u$
and $v$ as dynamically similar flows. The relation
[7], that follows from [5], is equivalent to the
equality of the Reynolds numbers for the two
flows,

    $R(u) = \frac{\max|u| \cdot |\Omega| \cdot \rho}{\mu} = \frac{\max|\alpha u| \cdot |\beta\Omega| \cdot 1}{\eta} = R(v)$

The condition [5] can be satisfied simultaneously
with the condition $\eta = 1$. For example, one may
choose $\beta = 1$, $\alpha = \rho/\mu$, and $\gamma = \mu/\rho$. This achieves a
rescaling of the equation to

    $v_t + v \cdot \nabla v = -\nabla q + \Delta v$   [8]

without changing the domain. Different Reynolds
numbers result from varying the magnitude of the
velocity. In what follows, we will work with the
Navier-Stokes equation in this simplest possible
form.
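
The bookkeeping behind [5], [7] and the invariance of the Reynolds number can be checked with exact arithmetic. The sketch below uses the choice $\beta = 1$, $\alpha = \rho/\mu$, $\gamma = \mu/\rho$ quoted in the text; the function names and the sample values of density, viscosity, speed and length are illustrative assumptions.

```python
from fractions import Fraction


def reynolds(max_speed, length, rho, mu):
    """Reynolds number R = max|u| * |Omega| * rho / mu."""
    return max_speed * length * rho / mu


def unit_viscosity_scaling(rho, mu):
    """Scaling factors (alpha, beta, gamma) from the text's example."""
    beta = Fraction(1)
    alpha = rho / mu
    gamma = mu / rho
    return alpha, beta, gamma


# Sample data, chosen only for illustration.
rho, mu = Fraction(1000), Fraction(1, 1000)
max_u, length = Fraction(2), Fraction(1, 10)

alpha, beta, gamma = unit_viscosity_scaling(rho, mu)
eta = alpha * beta * mu / rho                 # relation [7]
condition_5_holds = (alpha * gamma == beta)   # relation [5]

R_u = reynolds(max_u, length, rho, mu)
# For v the density is 1 and the viscosity is eta, as in equation [6].
R_v = reynolds(alpha * max_u, beta * length, Fraction(1), eta)
```

With these values `eta` equals 1 and `R_u` equals `R_v`, which is the statement that the rescaling changes the units but not the Reynolds number.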


Continuous Dependence on the Data 


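We begin our investigation of the initial boundary
value problem

    $u_t + u \cdot \nabla u = -\nabla p + \Delta u$,  $\nabla \cdot u = 0$  for $(x,t) \in \Omega \times (0, \infty)$,
    $u|_{t=0} = u_0$,  $u|_{\partial\Omega} = 0$   [9]

by considering two smooth solutions, say $u$ and $v$,
taking possibly different initial values $u_0$ and $v_0$. Let
their difference be $w = v - u$, with initial value $w_0$,
and let $q$ be the difference of the corresponding
pressures. Then, subtracting one equation from the
other, one obtains

    $w_t + w \cdot \nabla w + u \cdot \nabla w + w \cdot \nabla u = -\nabla q + \Delta w$   [10]

Multiplying this by $w$, integrating over $\Omega$, and
integrating by parts, one then obtains

    $\frac{1}{2} \frac{d}{dt} \|w\|^2 + \|\nabla w\|^2 = -(w \cdot \nabla u, w)$   [11]

where

    $\|w\|^2 = \int_\Omega w^2\,dx$,
    $\|\nabla w\|^2 = \int_\Omega \frac{\partial w_i}{\partial x_j} \frac{\partial w_i}{\partial x_j}\,dx$,
    $(w \cdot \nabla u, w) = \int_\Omega w_j \frac{\partial u_i}{\partial x_j} w_i\,dx$

since (and this should further explain our notation)

    $(w_t, w) = \int_\Omega w_t \cdot w\,dx = \frac{1}{2} \frac{d}{dt} \|w\|^2$,
    $(\Delta w, w) = \int_\Omega \frac{\partial^2 w_i}{\partial x_j \partial x_j} w_i\,dx = -\int_\Omega \frac{\partial w_i}{\partial x_j} \frac{\partial w_i}{\partial x_j}\,dx = -\|\nabla w\|^2$,
    $(\nabla q, w) = \int_\Omega \frac{\partial q}{\partial x_i} w_i\,dx = -\int_\Omega q \frac{\partial w_i}{\partial x_i}\,dx = 0$,
    $(u \cdot \nabla w, w) = \int_\Omega u_j \frac{\partial w_i}{\partial x_j} w_i\,dx = \frac{1}{2} \int_\Omega u_j \frac{\partial (w_i w_i)}{\partial x_j}\,dx = -\frac{1}{2} \int_\Omega \frac{\partial u_j}{\partial x_j} w^2\,dx = 0$,

and similarly $(w \cdot \nabla w, w) = 0$. In deriving these we
have used the fact that the vector fields are
divergence free and vanish on the boundary. In the
following, we will use such identities without further
mention.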

We can estimate the nonlinear term on the right- 
hand side of [11] by using the “Sobolev inequalities” 


VIP 
lel < Illl vell. 

¿112 2 113/2 
alls < loll vel", 


proved by Ladyzhenskaya (1969), though with
larger constants. These are valid for any smooth
function $\varphi$ which vanishes on the boundary of $\Omega$. It
may be either scalar or vector valued. The norms on
the left are $L^4$-norms; we use the notation
$\|\varphi\|_p = (\int_\Omega |\varphi|^p\,dx)^{1/p}$ for any $p \ge 1$, but usually
drop the subscript when $p = 2$. Using first Hölder's
inequality and then [12], one obtains


iln-—2 


12 
ifn-3 12 


(w Va, w)| < lwllslVul| 
y | jen] [Vio] || Vel 
T (ol vw? vul ifn =3 


if 54 —2 


Young's inequality

    $ab \le \frac{a^p}{p} + \frac{b^q}{q}$

holds if $a, b \ge 0$, $p, q > 1$ and $1/p + 1/q = 1$. Taking
$a = \sqrt{2}\,\|\nabla w\|$, along with $p = q = 2$ in the two-dimensional
case, and $a = (4/3)^{3/4} \|\nabla w\|^{3/2}$, along
with $p = 4/3$, $q = 4$ in the three-dimensional case,
one obtains


(w - Vu, w)| 


a. 5 p 2 
: pee +3 || Vuelo 
|| Vl +32 |Vu u\|* g 


i w= 2 [13] 
it a= 3 
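
A quick numerical check of Young's inequality with the exponent pairs used above, and of the fact that both quoted choices of $a$ make the term $a^p/p$ equal to $\|\nabla w\|^2$ (so that it can be absorbed by the left-hand side of [11]), might look as follows. The function names, the random sampling and the sample value 1.7 are assumptions made for this sketch.

```python
import math
import random


def young_bound(a, b, p, q):
    """Right-hand side a^p / p + b^q / q of Young's inequality."""
    return a ** p / p + b ** q / q


def a_term(grad_w_norm, n):
    """a^p / p for the choice of a quoted in the text, in dimension n = 2 or 3."""
    if n == 2:
        p = 2.0
        a = math.sqrt(2.0) * grad_w_norm
    elif n == 3:
        p = 4.0 / 3.0
        a = (4.0 / 3.0) ** 0.75 * grad_w_norm ** 1.5
    else:
        raise ValueError("n must be 2 or 3")
    return a ** p / p


# Sample the inequality for the two conjugate pairs (2, 2) and (4/3, 4).
random.seed(0)
violations = 0
for _ in range(1000):
    a, b = random.uniform(0.0, 10.0), random.uniform(0.0, 10.0)
    for p, q in ((2.0, 2.0), (4.0 / 3.0, 4.0)):
        if a * b > young_bound(a, b, p, q) + 1e-9:
            violations += 1

absorbed_2d = a_term(1.7, 2)  # should equal 1.7 ** 2
absorbed_3d = a_term(1.7, 3)  # should equal 1.7 ** 2
```

Random sampling is of course not a proof; it only confirms that the stated exponents are conjugate and that the prefactors $\sqrt{2}$ and $(4/3)^{3/4}$ are the ones that produce exactly $\|\nabla w\|^2$.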


Using these estimates for the right-hand side of [11],
we obtain linear differential inequalities for $\|w\|^2$
that are easily integrated to give


llt) |^ 
tema den ifn=2 [14] 


2 4 4 
|wol| exp DELATI dr, ifn=3 


It follows that if we can estimate the integrals on
the right, which concern only the solution $u$, and if
$v$ is a second solution, perhaps differing only
slightly from $u$ when $t = 0$, then we can estimate
the difference $\|v(t) - u(t)\|$ at later times. Moreover,
at any particular time this difference will be
bounded proportionally to $\|v(0) - u(0)\|$. The integral
on the right-hand side of the two-dimensional
version of [14] is easily estimated using the energy
estimate [16] below. The estimation of the corresponding
integral in the three-dimensional case,
without a restriction on the size of the data,
remains an open problem. It can be regarded as
the most important open problem in the Navier-Stokes
theory. It would never be enough to somehow
prove that solutions are smooth without
estimating this integral, or something equivalent
to it. Of course, if solutions were known to be
smooth one could infer their uniqueness from [14],
since smoothness would imply that the integrals are
finite, which is enough to conclude that $\|w(t)\|$ is
zero if $\|w_0\|$ is zero.


Energy Estimate 
If one multiplies the Navier-Stokes equation for $u$
by $u$, and proceeds as in deriving [11], one obtains

    $\frac{1}{2} \frac{d}{dt} \|u\|^2 + \|\nabla u\|^2 = 0$   [15]

and hence

    $\frac{1}{2} \|u(t)\|^2 + \int_0^t \|\nabla u\|^2\,d\tau = \frac{1}{2} \|u_0\|^2$   [16]

This settles the matter of continuous dependence in
the two-dimensional case. Together with [16], the
two-dimensional version of [14] implies


l(t) < lwollexplwol. ifm=2 [17 


We remark that the local rate of energy dissipation
is $2|Du|^2$ rather than $|\nabla u|^2$, where $Du$ is the
stress tensor $Du = (1/2)(\nabla u + (\nabla u)^T)$. However,
integrating over the domain, and integrating by
parts using the boundary condition $u|_{\partial\Omega} = 0$, one
may verify that the rate of total energy dissipation
$2\|Du\|^2$ equals $\|\nabla u\|^2$. For the purpose of this
article, it is convenient to write the energy identity
as [15].


Estimates for $\|\nabla u(t)\|$ Pointwise in Time


Of course, an estimate for $\|\nabla u(t)\|$ pointwise in time
would imply an estimate for the integral of $\|\nabla u(t)\|^4$
on the right-hand side of [14]. We can prove such an
estimate for at least a finite interval of time by an
argument due to Prodi (1962). It requires, in
preparation, some deep results concerning the
regularity of solutions of the steady Stokes equations.
These cannot be proved here, but we can
briefly summarize what will be needed. Let

$L^2(\Omega)$ = space of vector fields $\varphi$ with finite
$L^2$-norms $\|\varphi\|$,

$C_0^\infty(\Omega)$ = space of smooth vector fields with compact
support in $\Omega$,

$D(\Omega) = \{\varphi \in C_0^\infty(\Omega) : \nabla \cdot \varphi = 0\}$,

$J(\Omega)$ = completion of $D(\Omega)$ in the $L^2$-norm $\|\varphi\|$,

$J_1(\Omega)$ = completion of $D(\Omega)$ in the norm $\|\nabla\varphi\|$,

$G(\Omega) = \{\nabla p : p \in L^2(\Omega) \text{ with } \nabla p \in L^2(\Omega)\}$, and

$P : L^2(\Omega) \to J(\Omega)$ be the $L^2$-projection of $L^2(\Omega)$ onto
$J(\Omega)$,

and define the Sobolev $W_2^2(\Omega)$ norm by

    $\|u\|_{W_2^2(\Omega)}^2 = \|u\|^2 + \|\nabla u\|^2 + \int_\Omega |\partial^2 u_i / \partial x_j \partial x_k|^2\,dx$

Furthermore, observe that $(\nabla p, \varphi) = 0$ for $\nabla p \in G(\Omega)$
and $\varphi \in J(\Omega)$, since it holds if $p$ is smooth
and $\varphi \in D(\Omega)$. Therefore, $P\nabla p = 0$, since
$(P\nabla p, \varphi) = (\nabla p, \varphi) = 0$ for all $\varphi \in J(\Omega)$. Later,
when we need it, we will also argue that
$L^2(\Omega) = J(\Omega) \oplus G(\Omega)$.

With these preparations, it is evident that every
smooth vector field $u$ satisfying $\nabla \cdot u = 0$ and
$u|_{\partial\Omega} = 0$ can be regarded as a solution of the steady
Stokes problem

    $-\Delta u + \nabla p = f$ and $\nabla \cdot u = 0$ in $\Omega$,  $u|_{\partial\Omega} = 0$   [18]

with $f = -P\Delta u$. For such solutions, and hence for
all such $u$, we have the estimates

    $\|u\|_{W_2^2(\Omega)} \le c\,\|P\Delta u\|$   [19]

and

    $\sup_\Omega |u|^2 \le \begin{cases} c\,\|u\|\,\|P\Delta u\|, & \text{if } n = 2 \\ c\,\|\nabla u\|\,\|P\Delta u\|, & \text{if } n = 3 \end{cases}$   [20]

with constants independent of $u$. It can also be
shown that every such vector field $u$ belongs to $J_1(\Omega)$
and hence to $J(\Omega)$; see Heywood (1973).

Some history and remarks are in order. The
inequality [19] was proved independently by
Solonnikov (1964, 1966), and by Prodi's student
Cattabriga (1961). In fact, they gave $L^p$ versions of
it for all orders of the derivatives. Several proofs
specific to the $L^2$ case needed here have been given
by Solonnikov and Ščadilov (1973) and by Beirão da
Veiga (1997). The inequalities [20] can be proved by
combining [19] with appropriate Sobolev inequalities,
or better, by combining [19] with recent
inequalities of Xie (1991) which are of precisely
the form [20], but with $\Delta u$ instead of $P\Delta u$ on the
right-hand side, and without the requirement that
$\nabla \cdot u = 0$. The constant $c$ in [19] depends upon the
regularity of the boundary, and tends to infinity
along with a bound for the boundary curvature.
Through the work of Xie (1992, 1997), there is
reason to believe that the inequalities [20] are
probably valid for arbitrary domains, with the
constant $c = (2\pi)^{-1}$ if $n = 2$, and $c = (3\pi)^{-1}$ if $n = 3$.
Xie's efforts to prove this have been continued by
the author (Heywood 2001). If the inequalities
[20] can be proved for arbitrary domains (i.e.,
arbitrary open sets), with these fixed constants,
then the approach to Navier-Stokes theory presented
in this article will extend immediately to
arbitrary domains, as explained in Heywood and
Xie (1997), with estimates independent of the
domain.

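We go on now with an estimation of $\|\nabla u(t)\|$
based on [20]. Multiplying the Navier-Stokes
equation for $u$ by $-P\Delta u$, and integrating over $\Omega$,
one obtains

    $\frac{1}{2} \frac{d}{dt} \|\nabla u\|^2 + \|P\Delta u\|^2 = (u \cdot \nabla u, P\Delta u) \le \sup_\Omega |u|\,\|\nabla u\|\,\|P\Delta u\|$   [21]

since $(u_t, -P\Delta u) = (Pu_t, -\Delta u) = (u_t, -\Delta u) = (\nabla u_t, \nabla u)$
and $(\nabla p, P\Delta u) = 0$.

The right-hand side of [21] can be estimated using
[20] and Young's inequality:

    $\sup_\Omega |u|\,\|\nabla u\|\,\|P\Delta u\| \le \begin{cases} c\,\|u\|^{1/2} \|\nabla u\|\,\|P\Delta u\|^{3/2}, & \text{if } n = 2 \\ c\,\|\nabla u\|^{3/2} \|P\Delta u\|^{3/2}, & \text{if } n = 3 \end{cases}$

    $\le \begin{cases} \frac{1}{2} \|P\Delta u\|^2 + c\,\|u\|^2 \|\nabla u\|^4, & \text{if } n = 2 \\ \frac{1}{2} \|P\Delta u\|^2 + c\,\|\nabla u\|^6, & \text{if } n = 3 \end{cases}$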

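Thus,

    $\frac{d}{dt} \|\nabla u\|^2 + \|P\Delta u\|^2 \le \begin{cases} c\,\|u\|^2 \|\nabla u\|^4, & \text{if } n = 2 \\ c\,\|\nabla u\|^6, & \text{if } n = 3 \end{cases}$   [22]

These differential inequalities are at the core of
the present theory. Consider first the two-dimensional
case. It can be viewed as a linear differential
inequality

    $\frac{d}{dt} \|\nabla u\|^2 \le \left( c\,\|u\|^2 \|\nabla u\|^2 \right) \|\nabla u\|^2$   [23]

with a coefficient $c\,\|u\|^2 \|\nabla u\|^2$ that is integrable, in
view of the energy estimate [16]. Integrating it yields
a "global" estimate; for all $t > 0$,

    $\|\nabla u(t)\|^2 \le \|\nabla u_0\|^2 \exp\left( c \int_0^t \|u\|^2 \|\nabla u\|^2\,d\tau \right) \le \|\nabla u_0\|^2 \exp\left( \frac{c}{2} \|u_0\|^4 \right)$   [24]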

However, if the three-dimensional version of [22]
is viewed as a linear differential inequality, the
coefficient to be integrated is $\|\nabla u(t)\|^4$. Thus, the
same integral which is crucial to proving continuous
dependence on the data is also crucial to proving
regularity. What we can do in the three-dimensional
case is view [22] as a nonlinear differential inequality
of the form

    $\varphi' \le c\,\varphi^3$  or  $\varphi' \le c\,\|\nabla u\|^2 \varphi^2$   [25]

for $\varphi(t) = \|\nabla u(t)\|^2$. Integrating the first of these, one
obtains a local estimate

    $\|\nabla u(t)\|^2 \le \frac{\|\nabla u_0\|^2}{\sqrt{1 - 2c\,\|\nabla u_0\|^4 t}}$  for $0 \le t < \frac{1}{2c\,\|\nabla u_0\|^4}$   [26]

without any restriction on the size of the data.
Integrating the second, one obtains a global estimate

    $\|\nabla u(t)\|^2 \le \frac{\|\nabla u_0\|^2}{1 - c\,\|\nabla u_0\|^2 \int_0^t \|\nabla u\|^2\,d\tau} \le \frac{\|\nabla u_0\|^2}{1 - (c/2) \|u_0\|^2 \|\nabla u_0\|^2}$   [27]

valid for all $t > 0$, provided

    $\|u_0\|^2 \|\nabla u_0\|^2 < \frac{2}{c}$   [28]

This is a good interpretation of what we mean by
"small data." If Xie's conjecture is correct, that the
constant in the three-dimensional version of [20] is
$c = (3\pi)^{-1}$, then we obtain [25]-[28] with the
constant $c = 3/(128\pi^2)$. Thus, $2/c \approx 842$.
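
The numbers in this paragraph are easy to reproduce. The sketch below evaluates the small-data threshold $2/c$ for $c = 3/(128\pi^2)$, and checks by a finite difference that the right-hand side of the local estimate [26] is the solution of the comparison equation $\varphi' = c\varphi^3$ obtained from the first inequality in [25]. The function names and the sample initial value `phi0` are illustrative assumptions.

```python
import math

# Constant quoted in the text under Xie's conjecture.
c = 3.0 / (128.0 * math.pi ** 2)
small_data_threshold = 2.0 / c  # right-hand side of [28]


def local_bound(phi0, c, t):
    """phi0 / sqrt(1 - 2 c phi0^2 t), the bound [26] with phi = |grad u|^2."""
    return phi0 / math.sqrt(1.0 - 2.0 * c * phi0 ** 2 * t)


def blow_up_time(phi0, c):
    """Time at which the comparison solution of phi' = c phi^3 becomes infinite."""
    return 1.0 / (2.0 * c * phi0 ** 2)


# Sample initial value of |grad u_0|^2, chosen only for illustration.
phi0 = 2.0
t_star = blow_up_time(phi0, c)
t = 0.1 * t_star
h = 1e-4

# Central difference approximation of d(phi)/dt compared with c * phi^3.
derivative = (local_bound(phi0, c, t + h) - local_bound(phi0, c, t - h)) / (2 * h)
ode_rhs = c * local_bound(phi0, c, t) ** 3
```

The blow-up time of the comparison solution shrinks like $\|\nabla u_0\|^{-4}$, which is why [26] gives only a local estimate for large data.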


Further Regularity, Smoothing Estimates 
Once one has an estimate of the form

    $\|\nabla u(t)\| \le M(t)$,  for $0 < t < T$   [29]


as provided by [24], [26], or [27], one can estimate
the solution's derivatives of all orders over the open
time interval $(0, T)$. The initial time $t = 0$ must be
excluded from the interval, because the "imperfection"
of prescribed data generally causes an impulsive
acceleration along the boundary at time zero,
resulting in a thin boundary layer in which the
derivatives are so large that $\|\nabla u_t(t)\|$ and $\|u(t)\|_{W_2^3(\Omega)}$
tend to infinity as $t \to 0^+$. But the solution quickly
smooths and remains smooth as long as [29]
remains in force. Thus, our working assumption up
to this point, that solutions are $C^\infty$ smooth in
$\bar\Omega \times [0, \infty)$, is not valid at $t = 0$. However, we will see
that they are smooth in $\bar\Omega \times (0, T)$ and continuous in
$\bar\Omega \times [0, T)$. They are also continuous on $[0, T)$ in the
$W_2^2(\Omega)$ norm. This is sufficient regularity to justify
everything that we have done to this point.

In this section, we give estimates for the derivatives
of all orders with respect to time, of u and its first- and
second-order derivatives with respect to space. In the
next section, we will prove an existence theorem by
Galerkin approximation. It will be easily seen that all
of the estimates proved in this and previous sections,
for solutions that are assumed to be smooth, also hold
for the approximations, without any unproven
assumptions. Therefore, they will be inherited by the
solution that is obtained upon passing to the limit of
the approximations. At first, this solution will be
something of a generalized solution, not fully classical,
but one which is $C^\infty$ with respect to time over the
interval 0 < t < T, in the $W_2^2(\Omega)$ norm. In a final step,
viewing u at any fixed time as a solution of the steady
Stokes equations, we can apply regularity estimates for
the Stokes equations to infer that it is $C^\infty$ in all
variables throughout $\bar\Omega \times (0,T)$, with specific
estimates for each derivative.

The estimates of this section are obtained by
integrating an infinite sequence of differential inequalities,
for $\|u\|^2$, $\|\nabla u\|^2$, $\|u_t\|^2$, $\|\nabla u_t\|^2$, $\|u_{tt}\|^2$, $\|\nabla u_{tt}\|^2$, ....
The first two are [15] and [21], which have already
been dealt with. It turns out that after these first two, 
each succeeding differential inequality is linearized by 
the estimates obtained from its predecessor, which 
explains why the time intervals for these additional 
estimates do not become successively shorter. In fact, 
in the two-dimensional case, the energy estimate 
resulting from [15], which is valid for all time, already 
gives the linearization [23] of [21], which then 
provides an estimate valid for all time. Except for 
noting such differences between the two- and three- 
dimensional cases, we will henceforth deal with only 
the three-dimensional case. 

The differential inequalities just mentioned are 
obtained by estimating the right-hand sides of two 
sequences of differential identities, and ordering 
them by an iteration between the two sequences. 
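The first sequence begins with and is patterned after
the energy identity,

$$
\begin{aligned}
\frac12\frac{d}{dt}\|u\|^2 + \|\nabla u\|^2 &= 0 \\
\frac12\frac{d}{dt}\|u_t\|^2 + \|\nabla u_t\|^2 &= -(u_t\cdot\nabla u, u_t) \\
\frac12\frac{d}{dt}\|u_{tt}\|^2 + \|\nabla u_{tt}\|^2 &= -(u_{tt}\cdot\nabla u, u_{tt}) - 2(u_t\cdot\nabla u_t, u_{tt}) \\
&\ \ \text{etc.}
\end{aligned} \qquad [30]
$$

while the second begins with and is patterned after
Prodi's identity,

$$
\begin{aligned}
\frac12\frac{d}{dt}\|\nabla u\|^2 + \|P\Delta u\|^2 &= (u\cdot\nabla u, P\Delta u) \\
\frac12\frac{d}{dt}\|\nabla u_t\|^2 + \|P\Delta u_t\|^2 &= (u_t\cdot\nabla u, P\Delta u_t) + (u\cdot\nabla u_t, P\Delta u_t) \\
\frac12\frac{d}{dt}\|\nabla u_{tt}\|^2 + \|P\Delta u_{tt}\|^2 &= (u_{tt}\cdot\nabla u, P\Delta u_{tt}) + \cdots \\
&\ \ \text{etc.}
\end{aligned} \qquad [31]
$$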


Before going on, notice that we can return to [22] and
use [29] to infer a more complete estimate of the form

$$
\|\nabla u(t)\|^2 + \int_0^t \|P\Delta u\|^2\,d\tau \le B(M,t), \quad\text{for } 0 \le t < T \qquad [32]
$$

containing an integral of $\|P\Delta u\|^2$ on the left-hand side.
We will use the notation B(M,t) generically, for any
bound that depends only on the function M(t) and t.
We remark that a term $\|u_t\|^2$ can also be included
under the integral sign on the left-hand side of [32],
because $\|u_t\|$ and $\|P\Delta u\|$ are of essentially the
same order, being the leading terms in the projection
$u_t + P(u\cdot\nabla u) = P\Delta u$ of the Navier-Stokes equation.
Finally, one can also include $\|u\|^2_{W_2^2(\Omega)}$ under the
integral sign, in view of [19].

Going on, we obtain a third differential inequality 
from the second identity of the sequence [30]. Its 
right-hand side admits the estimate 


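$$
\begin{aligned}
-(u_t\cdot\nabla u, u_t) &\le \|u_t\|_4^2\,\|\nabla u\| \\
&\le c\|u_t\|^{1/2}\|\nabla u\|\,\|\nabla u_t\|^{3/2} \\
&\le \tfrac12\|\nabla u_t\|^2 + c\|\nabla u\|^4\|u_t\|^2
\end{aligned} \qquad [33]
$$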


which, in view of [29] or [32], produces a linear
differential inequality with integrable coefficients.
Its integration yields an estimate of the form

$$
\|u_t(t)\|^2 + \int_0^t \|\nabla u_t\|^2\,d\tau \le B(M,t,\|u_t(0)\|), \quad\text{for } 0 \le t < T \qquad [34]
$$

provided $\|u_t(0)\|$ is bounded. Since $u_t = P(\Delta u - u\cdot\nabla u)$,
we have the estimate

$$
\|u_t(0)\| = \|P(\Delta u_0 - u_0\cdot\nabla u_0)\|
\le \|\Delta u_0 - u_0\cdot\nabla u_0\| \le B(\|u_0\|_{W_2^2(\Omega)}) \qquad [35]
$$

provided that u is smooth in $\bar\Omega \times [0,T)$. This is a
delicate point, since we have been forewarned of a
regularity breakdown at t = 0. But we will be able to
replicate the estimate [35] for the Galerkin
approximations, ultimately validating [34] for the
approximations and the solution.

The integration of the next differential inequality,
which arises from the second of the identities [31],
requires that $\|\nabla u_t(0)\| < \infty$. Similarly to [35], we
have

$$
\|\nabla u_t(0)\| = \|\nabla P(\Delta u_0 - u_0\cdot\nabla u_0)\| \le B(\|u_0\|_{W_2^3(\Omega)}) \qquad [36]
$$

provided that u is smooth in $\bar\Omega \times [0,T)$. However,
there is a big difference between [35] and [36]. In the
next section, we will not be able to obtain an analog of
[36] for the Galerkin approximations. Consequently,
the solution that is obtained will not be fully regular at
time t = 0. It will satisfy $u \in C(\bar\Omega \times [0,T)) \cap C^\infty(\bar\Omega \times (0,T))$,
but not $u \in C^\infty(\bar\Omega \times [0,T))$. It will satisfy
$\|u(t) - u_0\|_{W_2^2(\Omega)} \to 0$ but not $\|u(t) - u_0\|_{W_2^3(\Omega)} \to 0$
as $t \to 0^+$.

One may wonder whether this is a fault or
deficiency in the Galerkin method. It is not,
remembering what was said at the beginning of
this section. For most prescribed values of $u_0$, no
matter how smooth, there is a breakdown in the
regularity of the solution as $t \to 0^+$. In fact, it was
proved in Heywood and Rannacher (1982) that if
$\|\nabla u_t(t)\|$ or any one of several other quantities,
including $\|u(t)\|_{W_2^3(\Omega)}$, remains bounded as $t \to 0^+$,
then there exists a solution $p_0$ of the overdetermined
Neumann problem

$$
-\Delta p_0 = \nabla\cdot(u_0\cdot\nabla u_0) \ \text{in } \Omega, \qquad
\nabla p_0|_{\partial\Omega} = \Delta u_0|_{\partial\Omega} \qquad [37]
$$

Generically speaking, this problem is not solvable,
and therefore

$$
\limsup_{t\to 0^+} \|\nabla u_t(t)\| = \infty
$$

We mention that under our assumption that $u_0$ is
smooth, the correctly posed Neumann problem,
with boundary condition $\partial p_0/\partial n|_{\partial\Omega} = \Delta u_0\cdot n|_{\partial\Omega}$, is
uniquely solvable for a solution $p_0 \in W_2^2(\Omega)/\mathbb{R}$, and
$\|\nabla p(t) - \nabla p_0\| \to 0$, as $t \to 0^+$; see Heywood and
Rannacher (1982).

Since solutions are smooth for 0 < t < T, the
pressure in the Navier-Stokes equations satisfies the
overdetermined Neumann problem for all $t \in (0,T)$.
So it may seem appropriate to require that the
prescribed initial value $u_0$ be a function for which
problem [37] is solvable. We do not agree with that.
It is too difficult, if not impossible, to find such
functions, except by solving the Navier-Stokes
equations. For example, one might think that the
condition that [37] should be solvable might be
satisfied if $u_0 \in D(\Omega)$, since such functions are zero
in a neighborhood of the boundary. In fact, K
Masuda has shown that if $\Omega$ is a three-dimensional
sphere, then the overdetermined Neumann problem
[37] is never solvable for nonzero $u_0 \in D(\Omega)$. Hence,
the gradient of the initial pressure will have a
nonzero tangential component, causing an impulsive
tangential acceleration along the boundary.

If we are to use the Navier-Stokes equations to 
make predictions of the future, we must solve the 
initial boundary value problem for “man-made” 
initial values, and accept the fact that there is a 
momentary breakdown in regularity along the 
boundary, immediately following the initial time. 
Thereafter, the solution smooths as “nature” takes 
over. To prove the reliability of our predictions, we 
need continuous dependence estimates and error 
estimates for numerical methods that take into
account this initial breakdown in the regularity.
The continuous dependence estimate [14] meets this 
requirement. So also do the error estimates given in 
a series of four papers by Rannacher and the author, 
beginning with Heywood and Rannacher (1982). 
They were based on the “smoothing” regularity 
estimates for solutions that are being presented here. 
We go on with these now, as models for similar 
estimates for the Galerkin approximations. 

Estimating the right-hand side of the second of the
identities [31] using [20] and Young's inequality,
and then multiplying through by t, we get the linear
differential inequality

$$
\frac{d}{dt}\left(t\|\nabla u_t\|^2\right) + t\|P\Delta u_t\|^2
\le \|\nabla u_t\|^2 + c\left(\|\nabla u\|^4 + \|P\Delta u\|^2\right)\left(t\|\nabla u_t\|^2\right) \qquad [38]
$$

for $t\|\nabla u_t\|^2$, with coefficients that are integrable in
view of the previous estimates [32], [34], and [35].
Therefore, its integration yields an estimate analogous
to [32] of the form

$$
t\|\nabla u_t(t)\|^2 + \int_0^t \tau\|P\Delta u_t\|^2\,d\tau
\le B(M,t,\|u_0\|_{W_2^2(\Omega)}), \quad\text{for } 0 < t < T \qquad [39]
$$

provided its "initial value" is finite. It is, due to the
time weight, in the sense that

$$
\limsup_{t\to 0^+}\left(t\|\nabla u_t\|^2\right) = 0 \qquad [40]
$$

This is proved by noting that if the lim sup were
positive, then the integral on the left-hand side of
[34] would be infinite. Finally, a term $t\|u_{tt}\|^2$ can be
included under the integral sign on the left-hand side
of [39], because $\|u_{tt}\|$ and $\|P\Delta u_t\|$ are of essentially
the same order, being the leading terms in the
projection $u_{tt} + P(u_t\cdot\nabla u + u\cdot\nabla u_t) = P\Delta u_t$ of the
time differentiated Navier-Stokes equation.

We continue inductively. Estimating the right-hand
side of the third of the identities [30] using
[12], [20], and Young's inequality, and then
multiplying through by $t^2$, we get the linear differential
inequality

$$
\frac{d}{dt}\left(t^2\|u_{tt}\|^2\right) + t^2\|\nabla u_{tt}\|^2
\le 2t\|u_{tt}\|^2 + 2t^2\|\nabla u_t\|\,\|P\Delta u_t\|
+ c\left(\|\nabla u\|^4 + \|\nabla u_t\|^2\right)\left(t^2\|u_{tt}\|^2\right) \qquad [41]
$$

with coefficients that are integrable in view of
preceding estimates. In particular, the integrability
of the first term on the right-hand side follows from
the boundedness of the integral

$$
\int_0^t \tau\|u_{tt}\|^2\,d\tau \qquad [42]
$$

which, we have pointed out, can be included on the
left-hand side of [39]. Finally, notice that the
boundedness of the integral [42] implies

$$
\limsup_{t\to 0^+}\left(t^2\|u_{tt}\|^2\right) = 0 \qquad [43]
$$

Therefore, we can integrate [41] to get the estimate

$$
t^2\|u_{tt}(t)\|^2 + \int_0^t \tau^2\|\nabla u_{tt}\|^2\,d\tau
\le B(M,t,\|u_0\|_{W_2^2(\Omega)}), \quad\text{for } 0 < t < T \qquad [44]
$$

analogous to [34].

At this point, we have introduced every device
needed to proceed by induction to an infinite
sequence of time-weighted estimates, similar to
[39] and [44], but with successively higher orders
of time derivatives and weights. The dependence of
these estimates on $\|u_0\|_{W_2^2(\Omega)}$ was introduced through
[34] and [35]. It can be eliminated by beginning the
introduction of powers of t as weight functions one
step earlier, with the added advantage that the initial
velocity $u_0$ needs only belong to $J_1(\Omega)$. In the
two-dimensional case, the weight functions can be
introduced even another step earlier, with the
advantage that the initial velocity $u_0$ needs only
belong to $J(\Omega)$. Each of these cases leads to an
existence theorem for solutions $u \in C^\infty(\bar\Omega \times (0,T))$,
with the initial values assumed in the norms of $J_1(\Omega)$
and $J(\Omega)$, respectively.


Existence by Galerkin Approximation 


Let $\{a^1, a^2, \ldots\}$ and $\{\lambda_1, \lambda_2, \ldots\}$ denote the
eigenfunctions and eigenvalues of the Stokes equations,

$$
-\Delta a^k + \nabla p = \lambda_k a^k, \quad \nabla\cdot a^k = 0 \ \text{in } \Omega, \qquad
a^k|_{\partial\Omega} = 0 \qquad [45]
$$

chosen to be orthonormal in $L^2(\Omega)$. Clearly,
$-P\Delta a^k = \lambda_k a^k$, so they are also the eigenfunctions
and eigenvalues of the Stokes operator, $-P\Delta$. Using
regularity estimates for the Stokes equations, each
eigenfunction is known to be $C^\infty$ smooth in $\bar\Omega$.

The nth Galerkin approximation for problem [9]
is the solution

$$
u^n(x,t) = \sum_{k=1}^{n} c_{kn}(t)\,a^k(x)
$$

of the system of ordinary differential equations

$$
(u^n_t, a^l) + (u^n\cdot\nabla u^n, a^l) = (\Delta u^n, a^l), \quad\text{for } l = 1, 2, \ldots, n \qquad [46]
$$

satisfying the initial conditions $(u^n(0), a^l) = (u_0, a^l)$,
for l = 1, 2, ..., n. Of course, since $(u^n_t, a^l) = dc_{ln}/dt$
and $(\Delta u^n, a^l) = (P\Delta u^n, a^l) = -\lambda_l c_{ln}$, the differential
equations can be written as

$$
\frac{dc_{ln}}{dt} = -\sum_{i,j=1}^{n} c_{in}c_{jn}\,(a^i\cdot\nabla a^j, a^l) - \lambda_l c_{ln}
$$

and the initial conditions as $c_{ln}(0) = (u_0, a^l)$, for
l = 1, 2, ..., n.

The system [46] is at least locally solvable, on
some interval $[0,T_n)$, with each coefficient satisfying
$c_{ln} \in C^\infty[0,T_n)$. Therefore, since the eigenfunctions
are also smooth, $u^n$ is $C^\infty$ smooth in $\bar\Omega \times [0,T_n)$. It
also satisfies all of the identities [30] and [31] on the
interval $[0,T_n)$. Indeed, multiplying [46] by $c_{ln}$ and
summing over l from 1 to n has the effect of
converting $a^l$ into $u^n$. The resulting identity for $u^n$
leads immediately to the energy identity

$$
\frac12\frac{d}{dt}\|u^n\|^2 + \|\nabla u^n\|^2 = 0 \qquad [47]
$$

The remaining identities in the sequence [30] are
obtained similarly. For example, the second is
obtained by taking the time derivative of [46],
multiplying through by $dc_{ln}/dt$ and summing over l.

Prodi's identity is obtained by multiplying [46] by
$\lambda_l c_{ln}$ and summing, which has the effect of converting
$a^l$ into $-P\Delta u^n$. To obtain the second of the
identities [31] for $u^n$, one differentiates [46],
multiplies by $\lambda_l\,dc_{ln}/dt$ and sums. The remaining identities
in the sequence [31] are obtained similarly.
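
The coefficient form of the Galerkin system can be illustrated with a small numerical experiment. The following is a minimal sketch under stated assumptions, not the author's computation: the tensor `b[i][j][l]` is a randomly generated stand-in for the integrals $(a^i\cdot\nabla a^j, a^l)$, built only to share their antisymmetry in the last two indices, and the values in `lam` are hypothetical eigenvalues. With that antisymmetry the nonlinear term drops out of the energy balance, so the discrete energy $\frac12\sum_l c_l^2$ should decay at the rate $\sum_l \lambda_l c_l^2$, mirroring the energy identity for the approximations.

```python
import random


def make_skew_tensor(n, seed=0):
    """Hypothetical stand-in for b[i][j][l] = (a^i . grad a^j, a^l).

    Only the antisymmetry b[i][j][l] = -b[i][l][j] of the true integrals is
    reproduced; the numerical values are random.
    """
    rng = random.Random(seed)
    b = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for l in range(j + 1, n):
                value = rng.uniform(-1.0, 1.0)
                b[i][j][l] = value
                b[i][l][j] = -value
    return b


def rhs(c, b, lam):
    """dc_l/dt = -sum_{i,j} c_i c_j b[i][j][l] - lam_l c_l."""
    n = len(c)
    out = []
    for l in range(n):
        convection = sum(c[i] * c[j] * b[i][j][l] for i in range(n) for j in range(n))
        out.append(-convection - lam[l] * c[l])
    return out


def rk4_step(c, b, lam, dt):
    """One classical fourth-order Runge-Kutta step for the coefficient system."""
    def shifted(base, slope, h):
        return [x + h * s for x, s in zip(base, slope)]

    k1 = rhs(c, b, lam)
    k2 = rhs(shifted(c, k1, dt / 2), b, lam)
    k3 = rhs(shifted(c, k2, dt / 2), b, lam)
    k4 = rhs(shifted(c, k3, dt), b, lam)
    return [x + dt / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
            for x, s1, s2, s3, s4 in zip(c, k1, k2, k3, k4)]


def energy(c):
    """Discrete analogue of (1/2)||u^n||^2 for orthonormal modes."""
    return 0.5 * sum(x * x for x in c)


def dissipation(c, lam):
    """Discrete analogue of ||grad u^n||^2 = sum_l lam_l c_l^2."""
    return sum(l * x * x for l, x in zip(lam, c))


def integrate(c0, b, lam, dt, steps):
    """Integrate the system; return final coefficients, energies, and the
    trapezoidal approximation of the time integral of the dissipation."""
    c = list(c0)
    energies = [energy(c)]
    dissipated = 0.0
    for _ in range(steps):
        new_c = rk4_step(c, b, lam, dt)
        dissipated += 0.5 * dt * (dissipation(c, lam) + dissipation(new_c, lam))
        c = new_c
        energies.append(energy(c))
    return c, energies, dissipated


if __name__ == "__main__":
    b = make_skew_tensor(4, seed=1)
    lam = [1.0, 2.0, 3.0, 4.0]
    _, energies, dissipated = integrate([0.5, -0.3, 0.2, 0.1], b, lam, 0.002, 500)
    print("initial energy:", energies[0])
    print("final energy + dissipated:", energies[-1] + dissipated)
```

The two printed numbers should agree up to the discretization error of the time stepping, which is the discrete counterpart of integrating the energy identity in time.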

The initial conditions easily imply that $\|u^n(0)\| \le \|u_0\|$,
because $u_0 \in J(\Omega)$ and the eigenfunctions are
orthogonal and complete in $J(\Omega)$. Therefore,
integration of [47] yields the energy estimate

$$
\frac12\|u^n(t)\|^2 + \int_0^t \|\nabla u^n\|^2\,d\tau \le \frac12\|u_0\|^2 \qquad [48]
$$

which is uniform in n. Since $\|u^n(t)\|$ remains bounded,
the solution $u^n(t)$ can be continued for all time. Thus,
$T_n = \infty$, for all n. Hence, our early working assumption
about solutions, that they are smooth in $\bar\Omega \times [0,\infty)$,
is actually valid for the Galerkin approximations.
The issue becomes one of obtaining estimates
for their derivatives that are uniform in n. All of the
estimates we have proved for solutions are proved in
exactly the same way for the approximations. The
only possible source of nonuniformity would arise
from the initial values of $\|\nabla u^n\|$ and $\|u^n_t\|$.

The estimates [24], [26], and [27] are uniform in
n, since $u_0 \in J_1(\Omega)$ and hence $\|\nabla u^n(0)\| \le \|\nabla u_0\|$,
due to the orthogonality of the eigenfunctions in the
inner product $(\nabla u, \nabla v)$, and their completeness with
respect to functions in $J_1(\Omega)$. We also obtain a
uniform bound for $\|u^n_t(0)\|$ of the form [35], by
multiplying [46] by $dc_{ln}/dt$ and summing over l. In
the last step, we also need the inequality
$\|u^n(0)\|_{W_2^2(\Omega)} \le c\|u_0\|_{W_2^2(\Omega)}$, which follows from the
orthogonality of the eigenfunctions in the inner
product $(P\Delta u, P\Delta v)$, and their completeness with
respect to functions in $J_1(\Omega) \cap W_2^2(\Omega)$; see
Ladyzhenskaya (1969, p. 46). Any attempt to find
a bound for $\|\nabla u^n_t(0)\|$ analogous to [36] is certain to
fail, as it would lead to a contradiction with the
aforementioned results from Heywood and Rannacher
(1982).


Passage to the Limit 


We now have $L^2$-bounds for $u^n$, $\nabla u^n$, $u^n_t$, $\partial^2 u^n/\partial x_i\partial x_j$,
and $\nabla u^n_t$ over any space-time region $\Omega \times (0,T')$, with
0 < T' < T. We also have $L^2$-bounds for all orders of
the time derivatives of these quantities over any
subregion $\Omega \times (\epsilon,T')$, with $0 < \epsilon < T' < T$. From
these $L^2$-bounds, we may infer the existence of a
subsequence of the Galerkin approximations, again
denoted by $\{u^n\}$, which converges, along with those of
its derivatives for which we have bounds, to a limit u
and its derivatives. The convergence $u^n \to u$ and
$\nabla u^n \to \nabla u$ is strong in $L^2(\Omega \times (0,T'))$; the convergence
of $u^n_t$ is strong in $L^2(\Omega \times (\epsilon,T'))$ and weak in
$L^2(\Omega \times (0,T'))$; the convergence of $P\Delta u^n$ is weak in
$L^2(\Omega \times (0,T'))$; all time derivatives of $u^n$, $\nabla u^n$
converge strongly in $L^2(\Omega \times (\epsilon,T'))$.

Because of estimates for the time derivatives, trace
arguments give the strong convergence $u^n \to u$,
$\nabla u^n \to \nabla u$, $u^n_t \to u_t$ and the weak convergence
$P\Delta u^n \to P\Delta u$, in $L^2(\Omega)$, for every t > 0.

For any fixed time, $u \in W_2^2(\Omega)$, and therefore u is
continuous in $\bar\Omega$ by a well-known Sobolev inequality.
Since $u \in J_1(\Omega)$, it must equal zero along the
boundary. The estimates for the time derivatives of
$u^n$, $\nabla u^n$, $\partial^2 u^n/\partial x_i\partial x_j$ imply that u and its time
derivatives are time continuous in $W_2^2(\Omega)$. Therefore,
$u, u_t, u_{tt}, \ldots$ are classically continuous in
$\bar\Omega \times (0,T)$.


Introduction of the Pressure 


Because of the strong convergence $u^n \to u$, $\nabla u^n \to \nabla u$,
$u^n_t \to u_t$ and the weak convergence
$P\Delta u^n \to P\Delta u$, in $L^2(\Omega)$, for any t > 0, it is an easy
matter to let $n \to \infty$ in [46], obtaining, for all t > 0,

$$
(u_t, a^l) + (u\cdot\nabla u, a^l) = (\Delta u, a^l), \quad\text{for } l = 1, 2, \ldots \qquad [49]
$$

Since the eigenfunctions are complete in $J(\Omega)$, and
$D(\Omega) \subset J(\Omega)$, this implies

$$
(u_t + u\cdot\nabla u - \Delta u, \phi) = 0, \quad\text{for all } \phi \in D(\Omega) \qquad [50]
$$

Therefore, there exists a vector field $\nabla p \in G(\Omega)$ such
that

$$
u_t + u\cdot\nabla u - \Delta u = -\nabla p \qquad [51]
$$

Indeed, the usual test to determine whether a
smooth vector field w is conservative in some
domain $\Omega$, and therefore representable as a gradient,
is to check whether the curve integrals

$$
\oint_C w\cdot\tau\,ds \qquad [52]
$$

vanish for every smooth closed curve $C \subset \Omega$. Here, $\tau$
is the unit tangent to the curve and ds is its arc
length. With a little reflection, one will realize that
these curve integrals can be approximated by
volume integrals of the form $(w, \phi)$ with $\phi \in D(\Omega)$.
For this, one should choose $\phi$ to have its support in
a small tubular neighborhood of the curve, and its
streamlines parallel to the curve, with unit net flux
through any section of the tube. If w is not smooth,
but only known to belong to $L^2(\Omega)$, one can
approximate it with its smooth mollifications. This
argument can be made rigorous. We previously
showed that $J(\Omega)$ and $G(\Omega)$ are orthogonal subspaces
of $L^2(\Omega)$. Now we have argued that
$L^2(\Omega) = J(\Omega) \oplus G(\Omega)$.


Classical $C^\infty$ Regularity


At any fixed time, we may regard u as a solution of the
steady Stokes problem [18] with $f = -u_t - u\cdot\nabla u$.
Included in Cattabriga (1961) and Solonnikov (1964,
1966) are regularity estimates for all orders of
derivatives of the form

$$
\|u\|_{W_2^{k+2}(\Omega)} \le c\|f\|_{W_2^k(\Omega)}
$$

From our estimates above, we easily conclude that
$f = -u_t - u\cdot\nabla u \in W_2^1(\Omega)$. Hence $u \in W_2^3(\Omega)$. In
fact, in view of the regularity we have proven
with respect to time, $f \in C^\infty(0,T;W_2^1(\Omega))$ and
$u \in C^\infty(0,T;W_2^3(\Omega))$. Thus begins a bootstrapping
argument. In the next step, we observe that
$f \in C^\infty(0,T;W_2^2(\Omega))$ and conclude that
$u \in C^\infty(0,T;W_2^4(\Omega))$. By induction, one obtains
$u \in C^\infty(0,T;W_2^k(\Omega))$ for every positive integer k. Then
well-known Sobolev inequalities imply that
$u \in C^\infty(\bar\Omega \times (0,T))$.


Assumption of the Initial Values


We begin by showing that $u(t) \to u_0$, weakly in
$L^2(\Omega)$, as $t \to 0^+$. Of course, $\|u(t)\|$ remains bounded
as $t \to 0^+$, in virtue of [48], and the eigenfunctions
$\{a^l\}$ are complete in $J(\Omega)$. Writing

$$
(u(t) - u_0, a^l) = (u(t) - u^n(t), a^l) + (u^n(t) - u^n(0), a^l) + (u^n(0) - u_0, a^l)
$$

note that the first and third terms on the right-hand
side can be made small by choosing n large. The
second can be written as

$$
(u^n(t) - u^n(0), a^l) = \int_0^t (u^n_t, a^l)\,d\tau
$$

which will be small if t is small, in view of [34].
Thus, $(u(t) - u_0, a^l) \to 0$, as $t \to 0^+$, which implies
the desired weak convergence.

The strong convergence $u(t) \to u_0$ in $L^2(\Omega)$ follows
from the weak convergence if $\limsup_{t\to 0^+}\|u(t)\| \le \|u_0\|$.
The energy estimate [48] for the approximations
implies this also.

To conclude that $u(t) \to u_0$ strongly in $J_1(\Omega)$, it only
remains to be shown that $\limsup_{t\to 0^+}\|\nabla u(t)\| \le \|\nabla u_0\|$.
This readily follows from [29], provided the
bounding function M(t) satisfies $M(t) \to \|\nabla u_0\|$, as
$t \to 0^+$. The bounding functions provided by our basic
estimates [24], [26], and [27] all have this property.

We may conclude that $u(t) \to u_0$ weakly in $W_2^2(\Omega)$,
provided $\|u(t)\|_{W_2^2(\Omega)}$ remains bounded as $t \to 0^+$. To
see this, remember that $\|P\Delta u\|$ and $\|u_t\|$ are of
essentially the same order. Thus the term $\|u_t(t)\|^2$ on
the left-hand side of [34] can be accompanied by a
term $\|u(t)\|^2_{W_2^2(\Omega)}$.

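Finally, to prove that $u(t) \to u_0$ strongly in $W_2^2(\Omega)$,
we need only show that $\limsup_{t\to 0^+}\|P\Delta u(t)\| \le \|P\Delta u_0\|$,
since $\|P\Delta\,\cdot\,\|$ and $\|\cdot\|_{W_2^2(\Omega)}$ are equivalent
norms on $J_1(\Omega) \cap W_2^2(\Omega)$. To this end, multiply [46]
by $\lambda_l\,dc_{ln}/dt$ and sum to get

$$
\begin{aligned}
\frac12\frac{d}{dt}\|P\Delta u^n\|^2 + \|\nabla u^n_t\|^2 &= (u^n\cdot\nabla u^n, P\Delta u^n_t) \\
&= \frac{d}{dt}(u^n\cdot\nabla u^n, P\Delta u^n) - (u^n_t\cdot\nabla u^n + u^n\cdot\nabla u^n_t, P\Delta u^n)
\end{aligned}
$$

Integrating this gives

$$
\begin{aligned}
\|P\Delta u^n(t)\|^2 - \|P\Delta u^n(0)\|^2 &= \int_0^t \frac{d}{ds}\|P\Delta u^n\|^2\,ds \\
&= 2(u^n\cdot\nabla u^n, P\Delta u^n)|_t - 2(u^n\cdot\nabla u^n, P\Delta u^n)|_0 \\
&\quad - 2\int_0^t \|\nabla u^n_t\|^2\,ds - 2\int_0^t (u^n_t\cdot\nabla u^n + u^n\cdot\nabla u^n_t, P\Delta u^n)\,ds
\end{aligned} \qquad [53]
$$

For the terms under the last integral we have

$$
|(u^n_t\cdot\nabla u^n, P\Delta u^n) + (u^n\cdot\nabla u^n_t, P\Delta u^n)|
\le \|\nabla u^n_t\|^2 + c\|\nabla u^n\|\,\|P\Delta u^n\|^3
$$

Therefore, [53] implies

$$
\|P\Delta u^n(t)\|^2 \le \|P\Delta u^n(0)\|^2 + 2(u^n\cdot\nabla u^n, P\Delta u^n)|_t
- 2(u^n\cdot\nabla u^n, P\Delta u^n)|_0 + Kt
$$

uniformly in n, as $t \to 0^+$, where K is a constant
depending on the estimates [32] and [34]. Letting
$n \to \infty$ gives

$$
\|P\Delta u(t)\|^2 \le \|P\Delta u_0\|^2 + 2(u\cdot\nabla u, P\Delta u)|_t
- 2(u\cdot\nabla u, P\Delta u)|_0 + Kt
$$

Since $u\cdot\nabla u \to u_0\cdot\nabla u_0$ strongly in $L^2(\Omega)$, and
$P\Delta u \to P\Delta u_0$ weakly in $L^2(\Omega)$, we get the desired
result. The continuous assumption of the initial values
in $W_2^2(\Omega)$ also implies their continuous assumption in
the classical sense, and hence that $u \in C(\bar\Omega \times [0,T))$.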


Conclusion 


Years ago, mathematical questions concerning the 
Navier-Stokes equations were usually considered in 
the context of generalized or weak solutions, which was 
a technical barrier to many in the scientific community. 
Nowadays, realizing that solutions are at least locally 
classical, fundamental questions such as that of global 
regularity can be studied within the classical context. If 
the estimate [29] is proved for classical solutions, with
$T = \infty$, and without a restriction on the size of the data,
this particular matter will be settled.


Acknowledgment 


This work has been supported by the Natural Sciences 
and Engineering Research Council of Canada. 


See also: Compressible Flows: Mathematical Theory; 
Elliptic Differential Equations: Linear Theory; 
Incompressible Euler Equations: Mathematical Theory; 
Interfaces and Multicomponent Fluids; Leray-Schauder 
Theory and Mapping Degree; Non-Newtonian Fluids; 
Partial Differential Equations: Some Examples; 
Stochastic Hydrodynamics; Turbulence Theories; 
Wavelets: Application to Turbulence. 


Further Reading 


Beirão da Veiga H (1997) A new approach to the $L^2$-regularity
theorem for linear stationary nonhomogeneous Stokes
systems. Portugaliae Mathematica 54(Fasc. 3): 271-286.

Cattabriga L (1961) Su un problema al contorno relativo al
sistema di equazioni di Stokes. Rendiconti del Seminario
Matematico della Università di Padova 31: 308-340.




Heywood JG (1976) On uniqueness questions in the theory of 
viscous flow. Acta Mathematica 136: 61-102. 

Heywood JG (1980) The Navier-Stokes equations: on the 
existence, regularity and decay of solutions. Indiana Univer- 
sity Mathematics Journal 29: 639-681. 

Heywood JG (1990) Open problems in the theory of the 
Navier-Stokes equations for viscous incompressible flow. In: 
Heywood JG, Masuda K, Rautmann R, and Solonnikov VA 
(eds.) The Navier-Stokes Equations: Theory and Numerical 
Methods, Lecture Notes in Mathematics, vol. 1431, pp. 1-22. 
Berlin-Heidelberg: Springer Verlag. 

Heywood JG (1994) Remarks concerning the possible global 
regularity of solutions of the three-dimensional incompressible 
Navier-Stokes equations. In: Galdi GP, Malek J, and Necas J 
(eds.) Progress in Theoretical and Computational Fluid
Dynamics, Pitman Research Notes in Mathematics Series, 
vol. 308, pp. 1-32. Essex: Longman Scientific and Technical. 

Heywood JG (2001) On a conjecture concerning the Stokes 
problem in nonsmooth domains. In: Neustupa J and Penel P 
(eds.) Mathematical Fluid Mechanics: Recent Results and 
Open Problems, Advances in Mathematical Fluid Mechanics 
vol. 2, pp. 195-206. Basel: Birkhäuser Verlag.

Heywood JG (2003) A curious phenomenon in a model problem, 
suggestive of the hydrodynamic inertial range and the smallest 
scale of motion. Journal of Mathematical Fluid Mechanics 5: 
403-423. 

Heywood JG and Rannacher R (1982) Finite element approxima- 
tion of the nonstationary Navier-Stokes problem, Part I: 
Regularity of solutions and second-order error estimates for 
the spatial discretization. SIAM Journal on Numerical 
Analysis 19: 275-311. 


Heywood JG and Xie W (1997) Smooth solutions of the vector 
Burgers equation in nonsmooth domains. Differential and 
Integral Equations 10: 961—974. 

Ladyzhenskaya OA (1969) The Mathematical Theory of Viscous 
Incompressible Flow, 2nd edn. New York: Gordon and 
Breach. 

Prodi G (1962) Teoremi di tipo locale per il sistema di Navier-
Stokes e stabilità delle soluzioni stazionarie. Rendiconti del
Seminario Matematico della Università di Padova 32:
374-397.

Solonnikov VA (1964) On general boundary-value problems for 
elliptic systems in the sense of Douglis-Nirenberg. I. Izvestiya 
Akademii Nauk SSSR, Seriya Matematicheskaya 28: 665-706.

Solonnikov VA (1966) On general boundary-value problems for 
elliptic systems in the sense of Douglis-Nirenberg. II. Trudy
Matematiceskogo Instituta Imeni V.A. Steklova 92: 233-297. 

Solonnikov VA and Ščadilov VE (1973) On a boundary value
problem for a stationary system of Navier-Stokes equations.
Proceedings. Steklov Institute of Mathematics 125: 186-199.

Xie W (1991) A sharp pointwise bound for functions with $L^2$
Laplacians and zero boundary values on arbitrary three- 
dimensional domains. Indiana University Mathematics Journal 
40: 1185-1192. 

Xie W (1992) On a three-norm inequality for the Stokes operator in 
nonsmooth domains. In: Heywood JG, Masuda K, Rautmann R, 
and Solonnikov VA (eds.) The Navier-Stokes Equations II: 
Theory and Numerical Methods, Lecture Notes in Mathematics, 
vol. 1530, pp. 310-315. Berlin-Heidelberg: Springer Verlag. 

Xie W (1997) Sharp Sobolev interpolation inequalities for the 
Stokes operator. Differential and Integral Equations 10: 
393—399. 


von Neumann Algebras: Introduction, Modular Theory,
and Classification Theory


V S Sunder, The Institute of Mathematical Sciences,
Chennai, India


© 2006 Elsevier Ltd. All rights reserved.


Introduction 


von Neumann algebras, as they are called now, 
first made their appearance under the name 
“rings of operators" in a series of seminal papers — 
see Murray and von Neumann (1936, 1937, 1943) 
and von Neumann (1936) — by F J Murray and J von 
Neumann starting in 1936. Murray and von 
Neumann (1936) specifically cite “attempts to 
generalize the theory of unitary group representa- 
tions” and “demands by various aspects of the 
quantum-mechanical formalism” among the reasons 
for the elucidation of this subject. 

In fact, the simplest definition of a von Neumann
algebra is via unitary group representations:
a collection M of continuous linear operators on a
Hilbert space $\mathcal{H}$ (in order to avoid some potential
technical problems, we shall restrict ourselves to
separable Hilbert spaces throughout this article) is
a von Neumann algebra precisely when there is a
representation $\rho$ of a group G as unitary operators
on $\mathcal{H}$ such that

$$
M = \{x \in \mathcal{L}(\mathcal{H}) : x\rho(t) = \rho(t)x \ \ \forall t \in G\}
$$

As above, we shall write $\mathcal{L}(\mathcal{H})$ for the collection of
all continuous linear operators on the Hilbert space $\mathcal{H}$;
recall that a linear mapping $x : \mathcal{H} \to \mathcal{H}$ is continuous
precisely when there exists a positive constant K such
that $\|x\xi\| \le K\|\xi\|\ \forall \xi \in \mathcal{H}$. If the norm $\|x\|$ of the
operator x is defined as the smallest constant K with
the above property, then the set $\mathcal{L}(\mathcal{H})$ acquires the
structure of a Banach space. In fact $\mathcal{L}(\mathcal{H})$ is a Banach
*-algebra with respect to the composition product, and
involution $x \mapsto x^*$ given by

$$
\langle x\xi, \eta\rangle = \langle \xi, x^*\eta\rangle \quad \forall \xi, \eta \in \mathcal{H}
$$


The first major result in the subject is the
remarkable "double commutant theorem," which
establishes the equivalence of a purely algebraic
requirement to purely topological ones. We need
two bits of terminology to be able to state the
theorem.

First, define the commutant $S'$ of a subset
$S \subset \mathcal{L}(\mathcal{H})$ by

$$
S' = \{x' \in \mathcal{L}(\mathcal{H}) : xx' = x'x \ \ \forall x \in S\}
$$

Second, the strong (resp., weak) operator topology is
the topology on $\mathcal{L}(\mathcal{H})$ of "pointwise strong (resp.,
weak) convergence": that is, $x_n \to x$ precisely when
$\|x_n\xi - x\xi\| \to 0\ \forall \xi \in \mathcal{H}$ (resp., $\langle x_n\xi - x\xi, \eta\rangle \to 0\ \forall \xi,
\eta \in \mathcal{H}$).


Theorem 1 The following conditions on a unital
*-subalgebra M of $\mathcal{L}(\mathcal{H})$ are equivalent:


(i) $M = M''$ $(= (M')')$.
(ii) M is closed in the strong operator topology.
(iii) M is closed in the weak operator topology.


The conventional definition of a von Neumann 
algebra is that it is a unital *-subalgebra of L(H) 
which satisfies the equivalent conditions above. The 
equivalence with our earlier "simplest definition" is
a consequence of the double commutant theorem 
and the fact that any element of a von Neumann 
algebra is a linear combination of four unitary 
elements of the algebra: simply take G to be the 
group of unitary operators in M’. 

Another consequence of the double commutant
theorem is that von Neumann algebras are closed
under any "canonical construction." For instance,
the uniqueness of the spectral measure $E \mapsto P_x(E)$
associated to a normal operator x shows that if u is
unitary, then $P_{uxu^*}(E) = uP_x(E)u^*$ for all Borel sets E.
In particular, if $x \in M$ and $u' \in \mathcal{U}(M')$, then
$u'P_x(E)u'^* = P_{u'xu'^*}(E) = P_x(E)$, and hence, we may
conclude that $P_x(E) \in \mathcal{U}(M')' = (M')' = M$ (we will
write $\mathcal{U}(N)$ (resp., $\mathcal{P}(N)$) to denote the collection of
unitary (resp., projection) operators in any von
Neumann algebra N); that is, if a von Neumann
algebra contains a normal operator, it also contains
all the associated spectral projections. This fact,
together with the spectral theorem, has the
consequence that any von Neumann algebra M is the
closed linear span of $\mathcal{P}(M)$.

The analogy with unitary group representations is
fruitful. Suppose then that $M = \rho(G)'$, for a unitary
representation $\rho$ of G. Then the last sentence of the
previous paragraph implies that $\rho(G)' = \mathbb{C}$ precisely
when there exist no nontrivial $\rho$-stable subspaces
(here and in the sequel, we identify $\mathbb{C}$ with its image
under the unique unital homomorphism of $\mathbb{C}$ into
$\mathcal{L}(\mathcal{H})$; and we reserve the symbol Z(M) to denote the
center of M), that is, when $\rho$ is irreducible. In general,
the $\rho$-stable subspaces are precisely the ranges of
projection operators in M. The notion of unitary
equivalence of subrepresentations of $\rho$ is seen to
translate to the equivalence defined on the set $\mathcal{P}(M)$
of projections in M, whereby $p \sim q$ if and only if
there exists an operator $u \in M$ such that $u^*u = p$ and
$uu^* = q$. (Such a u is called a partial isometry, with
"initial space" = range p, and "final space" = range q.)
This is the definition of what is known as the
"Murray-von Neumann equivalence rel M" and is
denoted by $\sim_M$. The following accompanying
definition is natural: if $p, q \in \mathcal{P}(M)$, say $p \preceq_M q$ if there
exists $p_0 \in \mathcal{P}(M)$ such that $p \sim_M p_0 \le q$, where of
course $e \le f \Leftrightarrow \mathrm{range}(e) \subset \mathrm{range}(f)$.
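
A finite-dimensional illustration may help fix the definition. The following is a minimal sketch, assuming M is the algebra of all 2 x 2 complex matrices and using hypothetical helper names; it checks that the matrix unit u below is a partial isometry with $u^*u = p$ and $uu^* = q$, so that the two rank-one projections p and q are Murray-von Neumann equivalent in this M.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def adjoint(a):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))


def is_projection(p):
    """Check p = p^2 = p^* (exact comparison, fine for 0/1 entries)."""
    return matmul(p, p) == p and adjoint(p) == p


def equivalent_via(u, p, q):
    """Check that u implements p ~ q, that is, u^*u = p and uu^* = q."""
    return matmul(adjoint(u), u) == p and matmul(u, adjoint(u)) == q


p = [[1, 0], [0, 0]]  # projection onto the first coordinate axis
q = [[0, 0], [0, 1]]  # projection onto the second coordinate axis
u = [[0, 0], [1, 0]]  # maps the first basis vector to the second

if __name__ == "__main__":
    print(is_projection(p), is_projection(q), equivalent_via(u, p, q))
```

In this matrix algebra the trace takes the same value on p and q, which is consistent with equivalent projections having ranges of equal dimension.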


The Murray-von Neumann 
Classification of Factors 


We start with a fact (whose proof is quite easy) and 
a consequent fundamental definition. 


Proposition 2 The following conditions on a von
Neumann algebra M are equivalent:


(i) for any $p, q \in \mathcal{P}(M)$, it is true that either $p \preceq_M q$
or $q \preceq_M p$.
(ii) $Z(M) = M \cap M' = \mathbb{C}$.


The von Neumann algebra M is called a “factor” if 
it satisfies the equivalent conditions above. 


The alert reader would have noticed that if G is 
a finite group, then p(G)’ is a factor precisely when 
the representation p is “isotypical.” Thus, the 
“representation-theoretic fact," that any unitary 
representation is expressible as a direct sum of 
isotypical subrepresentations, translates into the 
“von Neumann algebraic fact” that any *-subalgebra 
of L(H) is isomorphic, when H is finite dimensional, 
to a direct sum of factors. In complete generality, 
von Neumann (1949) showed that any von 
Neumann algebra is expressible as a “direct integral 
of factors." We shall interpret this fact from 
"reduction theory" as the statement that all the 
magic/mystery of von Neumann algebras is con- 
tained in factors and hence restrict ourselves, for a 
while, to the consideration of factors. 

Murray and von Neumann initiated the study of a 
general factor M via a qualitative as well as a 
quantitative analysis of the relation $\preceq_M$ on $\mathcal{P}(M)$.
First, call a $p \in \mathcal{P}(M)$ infinite if there exists a $p_0 \le p$
such that $p \sim_M p_0$ and $p_0 \ne p$; otherwise, say $p$ is
finite. They obtained an analog, called the “dimension
function,” of the Haar measure, as follows.


Theorem 3 


(i) With $M$ as above, there exists a function
$D_M : \mathcal{P}(M) \to [0, \infty]$ which satisfies the following
properties, and is determined up to a multiplicative
constant, by them:


• $p \preceq_M q \Leftrightarrow D_M(p) \le D_M(q)$

• $p$ is finite if and only if $D_M(p) < \infty$

• If $\{p_n : n = 1, 2, \ldots\}$ is any sequence of pairwise
orthogonal projections in $\mathcal{P}(M)$ and
$p = \sum_n p_n$, then $D_M(p) = \sum_n D_M(p_n)$


(ii) $M$ falls into exactly one of five possible cases,
depending on which of the following sets is the
range of some scaling of $D_M$:


• ($\mathrm{I}_n$) $\{0, 1, 2, \ldots, n\}$
• ($\mathrm{I}_\infty$) $\{0, 1, 2, \ldots, \infty\}$
• ($\mathrm{II}_1$) $[0, 1]$
• ($\mathrm{II}_\infty$) $[0, \infty]$
• ($\mathrm{III}$) $\{0, \infty\}$


In words, we may say that a factor M is of: 


1. type I (i.e., of type $\mathrm{I}_n$ for some $1 \le n \le \infty$)
precisely when $M$ contains a minimal projection,

2. type II (i.e., of type $\mathrm{II}_1$ or $\mathrm{II}_\infty$) precisely when $M$
contains nonzero finite projections but no minimal
projections, and

3. type III precisely when M contains no nonzero 
finite projections. 


Examples $L^\infty(\Omega, \mu)$ may be regarded as a von
Neumann algebra acting on $L^2(\Omega, \mu)$ as multiplication
operators; thus, if we set $m_f(\xi) = f\xi$,
then $m : f \mapsto m_f$ defines an isomorphism of $L^\infty(\Omega, \mu)$
onto a commutative von Neumann subalgebra of
$\mathcal{L}(L^2(\Omega, \mu))$. In fact, “up to multiplicity,” this is how
any commutative von Neumann algebra looks.


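It is a simple exercise to prove that $M \subset \mathcal{L}(\mathcal{H})$ is a
factor of type $\mathrm{I}_n$, $1 \le n \le \infty$, if and only if there exist
Hilbert spaces $\mathcal{H}_n$ and $\mathcal{K}$ and identifications $\mathcal{H} = \mathcal{H}_n \otimes \mathcal{K}$,
$M = \{x \otimes \mathrm{id}_{\mathcal{K}} : x \in \mathcal{L}(\mathcal{H}_n)\}$ where $\dim \mathcal{H}_n = n$; and
so $M \cong \mathcal{L}(\mathcal{H}_n)$.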

To discuss examples of the other types, it will be 
convenient to use “crossed products" of von 
Neumann algebras by ergodically acting groups of 
automorphisms. We shall now digress with a 
discussion of this generalization of the notion of a 
semidirect product of groups. 

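If $\alpha : G \to \mathrm{Aut}(M)$ is an action of a countable group
$G$ on $M$, where $M \subset \mathcal{L}(\mathcal{H})$ is a von Neumann
algebra, and $\tilde{\mathcal{H}} = \ell^2(G, \mathcal{H})$, there are representations
$\pi : M \to \mathcal{L}(\tilde{\mathcal{H}})$ and $\lambda : G \to U(\mathcal{L}(\tilde{\mathcal{H}}))$ defined by


$(\pi(x)\xi)(s) = \alpha_{s^{-1}}(x)\,\xi(s)$, $(\lambda(t)\xi)(s) = \xi(t^{-1}s)$


These representations satisfy the commutation relation
$\lambda(t)\pi(x)\lambda(t^{-1}) = \pi(\alpha_t(x))$, and the crossed product
$M \rtimes_\alpha G$ is the von Neumann subalgebra of $\mathcal{L}(\tilde{\mathcal{H}})$
defined by $\tilde{M} = (\pi(M) \cup \lambda(G))''$.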


Let us restrict ourselves to the case of
$M = L^\infty(\Omega, \mu)$ acting on $L^2(\Omega, \mu)$. In this case, it is
true that any automorphism of $M$ is of the form
$f \mapsto f \circ T^{-1}$, where $T$ is a “nonsingular transformation
of the measure space $(\Omega, \mu)$” (= a bijection
which preserves the class of sets of $\mu$-measure 0). So,
an action of $G$ on $M$ is of the form $\alpha_t(f) = f \circ T_t^{-1}$,
for some homomorphism $t \mapsto T_t$ from $G$ to the
group of nonsingular transformations of $(\Omega, \mu)$. We
have the following elegantly complete result from
Murray and von Neumann (1936).


Theorem 4 Let $M, G, \alpha$ be as in the last section,
and let $\tilde{M} = M \rtimes_\alpha G$. Assume the $G$-action is “free,”
meaning that if $t \ne 1 \in G$, then
$\mu(\{\omega \in \Omega : T_t(\omega) = \omega\}) = 0$. Then:


(i) $\tilde{M}$ is a factor if and only if $G$ acts ergodically on
$(\Omega, \mu)$, meaning that the only $G$-fixed functions
in $M$ are the constants.

(ii) Assume that $G$ acts ergodically. Then the type of
the factor $\tilde{M}$ is determined as follows:


• $\tilde{M}$ is of type I or II if and only if there exists
a $G$-invariant measure $\nu$ which is mutually
absolutely continuous with respect to $\mu$,
meaning $\nu(E) = 0 \Leftrightarrow \mu(E) = 0$ (the ergodicity
assumption implies that such a $\nu$ is necessarily
unique up to scaling by a positive
constant);

• $\tilde{M}$ is of type $\mathrm{I}_n$ precisely when the $\nu$ as above is
totally atomic, and $\Omega$ is the disjoint union of $n$
atoms for $\nu$;

• $\tilde{M}$ is of type II precisely when the $\nu$ as above is
nonatomic;

• $\tilde{M}$ is of finite type (meaning that 1 is a finite
projection in $\tilde{M}$) precisely when the $\nu$ as
above is a finite measure;

• $\tilde{M}$ is of type III if and only if there exists no $\nu$
as above.


Thus, we get all the types of factors by this 
construction; for instance, we may take: 


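($\mathrm{I}_n$) $G = \mathbb{Z}_n$ acting on $\Omega = \mathbb{Z}_n$ by translation, and
$\mu = \nu$ = counting measure

($\mathrm{I}_\infty$) $G = \mathbb{Z}$ acting on $\Omega = \mathbb{Z}$ by translation, and
$\mu = \nu$ = counting measure

($\mathrm{II}_1$) $G = \mathbb{Z}$ acting on $\Omega = \mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$ by
powers of an aperiodic rotation, and $\mu = \nu$ =
arclength measure

($\mathrm{II}_\infty$) $G = \mathbb{Q}$ acting on $\Omega = \mathbb{R}$ by translations, and
$\mu = \nu$ = Lebesgue measure

(III) $G$ = $ax + b$ group acting in the obvious manner
on $\Omega = \mathbb{R}$, $\mu$ = Lebesgue measure.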
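
As a finite sanity check of the hypotheses of Theorem 4, the sketch below takes the ($\mathrm{I}_n$) example, $G = \mathbb{Z}_n$ acting on $\Omega = \mathbb{Z}_n$ by translation, and verifies by brute force that the action is free and ergodic. The function names are illustrative only. Since the measure is counting measure on a finite set, "null set" is taken to mean "empty set", and ergodicity is tested on indicator functions, that is, on invariant subsets.

```python
from itertools import combinations

def translate(subset, t, n):
    """Image of a subset of Z_n under the translation w -> w + t (mod n)."""
    return frozenset((w + t) % n for w in subset)

def is_free(n):
    """Free: for t != 0 the fixed-point set {w : w + t = w mod n} is empty."""
    return all(
        not [w for w in range(n) if (w + t) % n == w]
        for t in range(1, n)
    )

def invariant_subsets(n):
    """All subsets E of Z_n with E + t = E for every t in Z_n."""
    found = []
    for size in range(n + 1):
        for combo in combinations(range(n), size):
            subset = frozenset(combo)
            if all(translate(subset, t, n) == subset for t in range(n)):
                found.append(subset)
    return found

def is_ergodic(n):
    """Ergodic: only the empty set and all of Z_n are invariant."""
    return set(invariant_subsets(n)) == {frozenset(), frozenset(range(n))}
```

For example, `is_free(5)` and `is_ergodic(5)` both hold, which matches the statement that this action yields a factor (here the finite-dimensional factor of type $\mathrm{I}_5$).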


Such crossed products of a commutative von 
Neumann algebra by an ergodically acting countable 
group were intensively studied by Krieger (1970,
1976). We shall simply refer to such factors as 
“Krieger factors.” The term “Krieger factor” is 
actually used for factors obtained from a slightly 
more general construction, with ergodic group 
actions replaced by more general ergodic equiva- 
lence relations. Since there is no difference in the 
two notions at least in good (amenable) cases, we 
will say no more about this. 


Abstract von Neumann Algebras 


So far, we have described matters as they were in 
von Neumann’s time. To come to the modern era, it 
is desirable to “free a von Neumann algebra from 
the ambient Hilbert space” and to regard it as an 
abstract object in its own right which can act on 
different Hilbert spaces; for example, $L^\infty(\Omega, \mu)$ is
an object worthy of study in its own right, without
reference to $L^2(\Omega, \mu)$.

The abstract viewpoint is furnished by a theorem
of Sakai (1983); let us define an abstract von
Neumann algebra to be an abstract C*-algebra
(this is a Banach algebra with an involution related
to the norm by the so-called C*-identity
$\|x\|^2 = \|x^*x\|$) $M$ which admits a predual $M_*$, i.e., $M$ is
isometrically isomorphic to the Banach dual space
$(M_*)^*$. It turns out that a predual of such an abstract
von Neumann algebra is unique up to isometric
isomorphism. Consequently, an abstract von
Neumann algebra comes equipped with a canonical
“weak*-topology,” usually called the “$\sigma$-weak topology”
on $M$. The natural morphisms in the category
of abstract von Neumann algebras are *-homomorphisms
which are continuous with respect to
$\sigma$-weak topologies on domain and range. It is
customary to call a linear map between abstract
von Neumann algebras “normal” if it is continuous
with respect to $\sigma$-weak topologies on domain and
range.

The equivalence of the “abstract” definition of 
this section, with the “concrete” one given earlier 
(which depends on an ambient Hilbert space), relies 
on the following four facts: 


1. $\mathcal{L}(\mathcal{H})$ is an abstract von Neumann algebra, with
the predual $\mathcal{L}(\mathcal{H})_*$ being the so-called “trace class”
of operators, equipped with the “trace norm.”

2. A self-adjoint subalgebra of $\mathcal{L}(\mathcal{H})$ is closed in the
strong operator topology, and is hence a “concrete
von Neumann algebra,” precisely when it is
closed in the $\sigma$-weak topology on $\mathcal{L}(\mathcal{H})$.

3. If $M$ is an abstract von Neumann algebra, and $N$
is a *-subalgebra of $M$ which is closed in the
$\sigma$-weak topology of $M$, then $N$ is also an abstract
von Neumann algebra, with one candidate for $N_*$
being $M_*/N_\perp$ (where $N_\perp = \{\rho \in M_* : n(\rho) = 0 \ \forall n \in N\}$).

4. Any abstract von Neumann algebra (with separable
predual) is isomorphic (in the category of
abstract von Neumann algebras) to a (concrete)
von Neumann subalgebra of $\mathcal{L}(\mathcal{H})$ (for a separable
$\mathcal{H}$).


With the abstract viewpoint available, we shall
look for modules over a von Neumann algebra $M$,
meaning pairs $(\mathcal{H}, \pi)$ where $\pi : M \to \mathcal{L}(\mathcal{H})$ is a normal
*-homomorphism.

A brief digression into the proof of fact (4) 
above — which asserts the existence of faithful 
M-modules — will be instructive and useful. Suppose 
M is an abstract von Neumann algebra. A linear
functional $\phi$ on $M$ is called a normal state if:


• (positivity) $\phi(x^*x) \ge 0 \ \forall x \in M$;
• (normality) $\phi : M \to \mathbb{C}$ is normal; and
• (normalization) $\phi(1) = 1$.


(Normal states on $L^\infty(\Omega, \mu)$ correspond to non-negative
probability measures on $\Omega$ which are
absolutely continuous with respect to $\mu$.) It is true
that there exist plenty of normal states on $M$.
In fact, they linearly span $M_*$. This implies that if
$M_*$ is separable, then there exist normal states
on $M$ which are even “faithful,” meaning
$\phi(x^*x) = 0 \Rightarrow x = 0$.

Fix a faithful normal state $\phi$ on $M$. (Consistent
with our convention about separable $\mathcal{H}$'s, we shall
only consider $M$'s with separable preduals.) The
well-known “Gelfand-Naimark-Segal” construction
then yields a faithful $M$-module which is usually
denoted by $L^2(M, \phi)$, motivated by the fact that if
$M = L^\infty(\Omega, \mu)$, and $\phi(f) = \int f \, d\nu$, with $\nu$ a probability
measure mutually absolutely continuous with
respect to $\mu$, then $L^2(M, \phi) = L^2(\Omega, \nu)$ with $L^\infty(\Omega, \mu)$
acting as multiplication operators. The construction
mimics this case: the assumptions on $\phi$ ensure that
the equation


$\langle x, y \rangle = \phi(y^*x)$


defines a positive-definite inner product on $M$; let
$L^2(M, \phi)$ be the Hilbert space completion of $M$. It
turns out that the operator of left-multiplication by
an element of $M$ extends as a bounded operator to
$L^2(M, \phi)$, and it then follows easily that $L^2(M, \phi)$ is
indeed a faithful $M$-module, thereby establishing
fact (4) above.

Since we wish to distinguish between elements of
the dense subspace $M$ of $L^2(M, \phi)$ and the operators
of left-multiplication by members of $M$, let us write
$\hat{x}$ for an element of $M$ when thought of as an
element of $L^2(M, \phi)$, and $x$ for the operator of
left-multiplication by $x$; thus, for instance, $\hat{x} = x\hat{1}$, and
$x\hat{y} = \widehat{xy}$, $\langle x\hat{1}, \hat{1} \rangle = \phi(x)$, etc.
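
The construction can be tried out in the simplest commutative case. The sketch below assumes a finite measure space with three points, so that $M$ is the algebra of complex functions on three points and the state is given by a list of positive weights; the names `phi`, `inner`, `left_mult` and the sample data are illustrative choices and not from the text.

```python
def phi(f, weights):
    """State phi(f) = sum_i weights[i] * f[i] on functions over a finite set."""
    return sum(w * fi for w, fi in zip(weights, f))

def inner(x, y, weights):
    """GNS inner product <x, y> = phi(y* x), with y* the pointwise conjugate."""
    return phi([yi.conjugate() * xi for xi, yi in zip(x, y)], weights)

def left_mult(x, v):
    """Left multiplication by x acting on the vector v (pointwise product)."""
    return [xi * vi for xi, vi in zip(x, v)]

weights = [0.5, 0.25, 0.25]      # a faithful state: every weight is positive
one = [1 + 0j, 1 + 0j, 1 + 0j]   # the identity of M, viewed as a vector
x = [2 + 1j, -1j, 3 + 0j]
```

With these definitions, `inner(left_mult(x, one), one, weights)` agrees with `phi(x, weights)`, and `inner(x, x, weights)` is a positive real number, as positivity and faithfulness of the state require.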


Modular Theory 


While type III factors were more or less an enigma
at the time of von Neumann, all that changed with
the advent of Connes. The first major result of this
“type III era” is the celebrated “Tomita-Takesaki
theorem” (cf. Takesaki (1970)), which views the
adjoint mapping on $M$ as an appropriate operator
on $L^2(M, \phi)$, and analyzes its polar decomposition.
Specifically, we have:


Theorem 5 If $\phi$ is any faithful normal state on $M$,
consider the densely defined conjugate-linear operator
$S_\phi^{(0)}$ given, with domain $\{\hat{x} : x \in M\}$, by $S_\phi^{(0)}(\hat{x}) = \widehat{x^*}$.
Then,


(i) there is a unique conjugate-linear operator
$S_\phi$ (the “closure of $S_\phi^{(0)}$”) whose graph is
the closure of the graph of $S_\phi^{(0)}$; if we write
$S_\phi = J_\phi \Delta_\phi^{1/2}$ for the polar decomposition of the
conjugate-linear closed operator $S_\phi$, then
(ii) $J_\phi$ is an antiunitary involution on $L^2(M, \phi)$ (i.e.,
it is a conjugate-linear norm-preserving bijection
of $L^2(M, \phi)$ onto itself which is its own
inverse);

(iii) $\Delta_\phi$ is an injective positive self-adjoint operator
on $L^2(M, \phi)$ such that $J_\phi f(\Delta_\phi) J_\phi = f(\Delta_\phi^{-1})$ for all
Borel functions $f : \mathbb{R} \to \mathbb{R}$, and most crucially

(iv) $J_\phi M J_\phi = M'$ and $\Delta_\phi^{it} M \Delta_\phi^{-it} = M \ \forall t \in \mathbb{R}$


(Here and elsewhere, we shall identify $x \in M$
with the operator of “left-multiplication by $x$”
on $L^2(M, \phi)$.)


Thus, each faithful normal state $\phi$ on $M$ yields a
one-parameter group $\{\sigma_t^\phi : t \in \mathbb{R}\}$ of automorphisms
of $M$, referred to as the group of “modular
automorphisms,” given by


$\sigma_t^\phi(x) = \Delta_\phi^{it} x \Delta_\phi^{-it}$


The extent of dependence of the modular group on
the state is captured precisely by Connes' Radon-Nikodym
theorem (Connes 1973), which shows that
the modular groups associated to two different
faithful normal states are related by a “unitary
cocycle in $M$.” This has the consequence that if
$\epsilon : \mathrm{Aut}(M) \to \mathrm{Out}(M) = \mathrm{Aut}(M)/\mathrm{Int}(M)$ is the quotient
mapping (where $\mathrm{Int}(M)$ denotes the normal
subgroup of inner automorphisms given by unitary
elements of $M$), then the one-parameter subgroup
$\{\epsilon(\sigma_t^\phi) : t \in \mathbb{R}\}$ of $\mathrm{Out}(M)$ is independent of $\phi$.
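
To see what the modular group looks like in the simplest case, here is a finite-dimensional sketch. The choice $M = M_n(\mathbb{C})$ and the invertible density matrix $\rho$ are assumptions made for illustration; the formulas follow from the inner product $\langle x, y \rangle = \phi(y^*x)$ by a direct computation.

```latex
M = M_n(\mathbb{C}), \qquad \phi(x) = \mathrm{Tr}(\rho x), \qquad
\rho > 0, \quad \mathrm{Tr}(\rho) = 1

\Delta_\phi \hat{x} = \widehat{\rho x \rho^{-1}}, \qquad
\sigma_t^\phi(x) = \Delta_\phi^{it} x \Delta_\phi^{-it} = \rho^{it} x \rho^{-it}
```

In this case every modular automorphism is inner, so its image in $\mathrm{Out}(M)$ is trivial, and when $\phi$ is the normalized trace ($\rho = 1/n$) the modular group itself is trivial.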


Connes' Classification and 
Injective Factors 


Given a factor $M$, Connes defined

$S(M) = \bigcap \{\mathrm{spec}(\Delta_\phi) : \phi \text{ a faithful normal state on } M\}$


which is obviously an isomorphism invariant. He
then classified (Connes 1973) type III factors into a
continuum of types:


Theorem 6 Let $M$ be a factor. Then,


(i) $0 \in S(M) \Leftrightarrow M$ is of type III; and
(ii) if $M$ is a type III factor, there are three mutually
exclusive and exhaustive possibilities:


• ($\mathrm{III}_0$) $S(M) = \{0, 1\}$
• ($\mathrm{III}_\lambda$) $S(M) = \{0\} \cup \lambda^{\mathbb{Z}}$, for some $0 < \lambda < 1$
• ($\mathrm{III}_1$) $S(M) = [0, \infty)$


Example 7 Consider the compact group $\Omega = \prod_{n=1}^\infty G_n$,
where $G_n$ is a finite cyclic group of order $\nu_n$ for
each $n$. Let $\mu = \prod_{n=1}^\infty \mu_n$, where $\mu_n$ is a probability
measure defined on the subsets of $G_n$ which assigns
positive mass to each singleton. Let $G = \bigoplus_{n=1}^\infty G_n$ be
the dense subgroup of $\Omega$ consisting of finitely
nonzero sequences. It is not hard to see that each
translation $T_g$, $g \in G$ (given by $T_g(\omega) = g + \omega$) is a
nonsingular transformation of the measure space
$(\Omega, \mu)$. The density of $G$ in $\Omega$ shows that this action
of $G$ on $L^\infty(\Omega, \mu)$ is fixed-point-free and ergodic,
with the result that the crossed product $L^\infty(\Omega, \mu) \rtimes G$
is a factor.

Krieger showed that in the case of a Krieger factor
$M = L^\infty(\Omega, \mu) \rtimes G$, the invariant $S(M)$ agrees with the
so-called “asymptotic ratio set” of the group $G$ of
nonsingular transformations, which is computable
purely in terms of the Radon-Nikodym derivatives
$d(\mu \circ T_t)/d\mu$. Using this ratio set description, it is
not hard to see that the Krieger factor $M$ given by
the infinite product $\Omega$


• is a factor of type $\mathrm{III}_\lambda$ if $\nu_n = 2$ and $\mu_n\{0\} = \lambda/(1 + \lambda)$
for all $n$;

• is a factor of type $\mathrm{III}_1$ if $\nu_n = 2$ and $\mu_{2n}\{0\} = \lambda/(1 + \lambda)$,
$\mu_{2n+1}\{0\} = \mu/(1 + \mu)$, for all $n$, provided
that $\{\lambda, \mu\}$ generates a dense multiplicative
subgroup of $\mathbb{R}_+^\times$;

• can be of type $\mathrm{III}_0$.
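
The role of the powers of $\lambda$ can be made tangible with a finite computation. The sketch below assumes the two-point convention $\mu_n\{0\} = \lambda/(1+\lambda)$, $\mu_n\{1\} = 1/(1+\lambda)$ and looks only at the first few coordinates; it shows that the ratio of the product weights of any two finite words is a power of $\lambda$. This is only a heuristic for why the ratio set is $\{0\} \cup \lambda^{\mathbb{Z}}$, not a computation of the asymptotic ratio set itself, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def point_weight(word, lam):
    """Product weight of a finite 0/1 word: mu{0} = lam/(1+lam), mu{1} = 1/(1+lam)."""
    weight = Fraction(1)
    for bit in word:
        weight *= (lam if bit == 0 else 1) / (1 + lam)
    return weight

def weight_ratios(n, lam):
    """All ratios weight(a)/weight(b) over pairs of 0/1 words of length n."""
    words = list(product((0, 1), repeat=n))
    return {point_weight(a, lam) / point_weight(b, lam) for a in words for b in words}

lam = Fraction(1, 2)
ratios = weight_ratios(3, lam)
```

Here `ratios` consists exactly of the powers `lam ** k` for `k` from -3 to 3; exact rational arithmetic is used so that the comparison involves no rounding.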


Among all factors, Connes identified one tractable
class (the so-called injective factors) which are
ubiquitous and amenable to classification. To start
with, he established the equivalence of several
(seemingly quite disparate) requirements on a von
Neumann algebra $M \subset \mathcal{L}(\mathcal{H})$, ranging from injectivity
(meaning the existence of a projection of norm
1 from $\mathcal{L}(\mathcal{H})$ onto $M$) to “approximate finite
dimensionality” (meaning $M = (\bigcup_n A_n)''$ for some
increasing sequence $A_1 \subset A_2 \subset \cdots \subset A_n \subset \cdots$ of
finite-dimensional *-subalgebras). In the same
paper, Connes (1976) essentially finished the complete
classification of injective factors. Only the
injective $\mathrm{III}_1$ factor withstood his onslaught; but
eventually even it had to surrender to the technical
virtuosity of Haagerup (1987) a few years later!

In the language we have developed thus far, the 
classification of injective factors may be summarized 
as follows: 


• Every injective factor is isomorphic to a Krieger
factor.

• Up to isomorphism, there is a unique injective
factor of each type with the solitary exception of
$\mathrm{III}_0$.

• Injective factors of type $\mathrm{III}_0$ are classified (up to
isomorphism) by an invariant of an ergodic-theoretic
nature called the “flow of weights”;
unfortunately, coming up with a crisp description
of this invariant, which is simultaneously accessible
to the nonexpert and is consistent with the
stipulated size of this survey, is beyond the scope
of this author.


The interested reader is invited to browse through 
one of the books (Connes 1994, Sunder 1986, 
Dixmier 1981) for further details; the third book is 
the oldest (a classic but the language has changed a 
bit since it was written), the second is more recent 
(but quite sketchy in many places), and the first is 
clearly the best choice (if one has the time to read it 
carefully and digest it). Alternatively, the interested 
reader might want to browse through the encyclopedic
treatments (Kadison and Ringrose) or
(Takesaki).


See also: Algebraic Approach to Quantum Field Theory; 
Bicrossproduct Hopf Algebras and Noncommutative 
Spacetime; Braided and Modular Tensor Categories; 
C*-Algebras and Their Classification; Ergodic Theory; 
Finite-Type Invariants; Hopf Algebra Structure of 


Renormalizable Quantum Field Theory; Hopf Algebras 
and g-Deformation Quantum Groups; The Jones 
Polynomial; Knot Theory and Physics; Noncommutative 
Geometry and the Standard Model; Noncommutative 
Tori, Yang-Mills and String Theory; Positive Maps on 
C*-Algebras; Quantum 3-Manifold Invariants; Quantum 
Entropy; Tomita—Takesaki Modular Theory; 
Introduction


Subfactor theory was initiated by Jones (1983) and 
has experienced rapid progress beyond the frame- 
work of operator algebras. Here we start with a 
basic introduction in this section. 

A factor is a von Neumann algebra with a trivial
center. A von Neumann algebra $M$ is an algebra of
bounded linear operators on a Hilbert space $H$,
which contains the identity operator and is closed
under the *-operation and weak operator topology,
and its center is the intersection of $M$ and its
commutant


$M' = \{x \in B(H) \mid xy = yx \text{ for all } y \in M\}$


where $B(H)$ denotes the set of all the bounded linear
operators on $H$. (We are mostly interested in
separable, infinite-dimensional Hilbert spaces. A
von Neumann algebra is automatically closed also
in the norm topology and thus it is also a C*-algebra.)
By definition, a factor $M$ acts on a certain Hilbert
space $H$, but we also consider its action on another
Hilbert space $K$, that is, a $\sigma$-weakly continuous
homomorphism preserving the *-operation from $M$
into $B(K)$. A subfactor is a factor $N$ which is
contained in another factor $M$ and has the same
identity. A factor is classified into types
$\mathrm{I}_n$ ($n = 1, 2, 3, \ldots$), $\mathrm{I}_\infty$, $\mathrm{II}_1$, $\mathrm{II}_\infty$, and III. In most of
the interesting studies of subfactors, the two factors
are both of type $\mathrm{II}_1$ or both of type III. A factor $M$ is
said to be of type $\mathrm{II}_1$ if it is infinite dimensional
and has a finite trace $\mathrm{tr} : M \to \mathbb{C}$. By definition, a
finite trace tr is a linear functional on $M$ satisfying
$\mathrm{tr}(1) = 1$, $\mathrm{tr}(xy) = \mathrm{tr}(yx)$ for all $x, y \in M$, and
$\mathrm{tr}(x^*x) \ge 0$ for all $x \in M$. When a factor $M$, not
isomorphic to $\mathbb{C}$, acts on a separable Hilbert space, it
is of type III if and only if for any two nonzero
projections $p, q \in M$, we have an operator $v \in M$
with $vv^* = p$ and $v^*v = q$. One obviously cannot have
a trace on such a factor. (See Takesaki (2002, 2003)
for a general theory on factors.)

Let $M$ be a type $\mathrm{II}_1$ factor acting on a Hilbert
space $H$. We then have the coupling constant of
Murray and von Neumann, which is denoted by
$\dim_M H$ and belongs to $(0, \infty]$. This measures the
relative dimension of $H$ with respect to $M$. Note
that the factor $M$ acts on $M$ itself by the left
multiplication. We introduce an inner product on
$M$ by $\langle x, y \rangle = \mathrm{tr}(y^*x)$ and denote the completion by
$L^2(M)$. Then $M$ acts on this Hilbert space and we
have $\dim_M L^2(M) = 1$.

Let $N \subset M$ be a subfactor and suppose that both
$N$ and $M$ are of type $\mathrm{II}_1$. (We then simply say that
$N \subset M$ is a type $\mathrm{II}_1$ subfactor.) Suppose that $M$ acts
on a Hilbert space $H$ with $\dim_N H < \infty$. Then we
define the Jones index of $N$ in $M$ by


$[M : N] = \dim_N H / \dim_M H$


This number is independent of the choice of $H$, as
long as $\dim_N H < \infty$, so we can take $H = L^2(M)$;
then we have $[M : N] = \dim_N L^2(M)$. The equality
$[M : N] = 1$ means $M = N$. The first major discovery
of Jones (1983) is that the value of the Jones index is
in the set


$\{4\cos^2(\pi/m) \mid m = 3, 4, 5, \ldots\} \cup [4, \infty]$   [1]


and all the values in this set are indeed realized.
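
To get a feel for the discrete part of the set [1], the following sketch evaluates $4\cos^2(\pi/m)$ numerically and gives a rough membership test. The function names, the tolerance `tol`, and the cutoff `max_m` are arbitrary choices made here; the test is approximate very close to 4, where the discrete values accumulate.

```python
import math

def jones_index_below_four(m):
    """The value 4*cos^2(pi/m) from the discrete part of the set [1], for m >= 3."""
    if m < 3:
        raise ValueError("m must be at least 3")
    return 4 * math.cos(math.pi / m) ** 2

def is_allowed_index(value, tol=1e-9, max_m=10_000):
    """Rough membership test for the set [1]; tol and max_m are arbitrary cutoffs."""
    if value >= 4 - tol:
        return True
    return any(
        abs(value - jones_index_below_four(m)) < tol for m in range(3, max_m)
    )
```

The first few values are 1 (for m = 3), 2 (m = 4), $(3 + \sqrt{5})/2$ (m = 5) and 3 (m = 6); a value such as 2.5 is excluded, while every value from 4 upward is allowed.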

Suppose we have a $\mathrm{II}_1$ factor $M$ and an action of
an at most countable, discrete group $G$ on $M$, that
is, a homomorphism $\alpha : G \to \mathrm{Aut}(M)$, where $\mathrm{Aut}(M)$
is the automorphism group of $M$. Then we have a
construction $M \rtimes_\alpha G$, called the crossed product. If
$\alpha_g$ is not an inner automorphism of $M$ for any $g \in G$
other than the identity element of $G$, then $M \rtimes_\alpha G$ is
also a type $\mathrm{II}_1$ factor. (An automorphism $\pi$ of $M$ is
said to be inner if it is of the form $\pi(x) = uxu^*$ for
some unitary operator $u \in M$.) The index of a
subfactor $M \subset M \rtimes_\alpha G$ is the order of $G$, which can
be infinite. If we have a subgroup $H$ of $G$, then we
obtain a subfactor $M \rtimes_\alpha H \subset M \rtimes_\alpha G$ and its index is
given by the index $[G : H]$ of the subgroup $H$. This
analogy to the index of a subgroup is the origin of
the terminology of the Jones index for a subfactor.
The Jones index is also analogous to the degree of
an extension of a field. From the viewpoint of this
analogy, subfactor theory can be regarded as a
certain generalized analogue (or the “quantum”
version) of the classical Galois theory for field
extensions. (The direct analog of the classical Galois
correspondence for subfactors was studied by
Nakamura-Takeda in the early days, and Izumi-Longo-Popa
gave the most general form.)

The tools Jones (1983) has introduced to study
subfactors are as follows. Let $N \subset M$ be a subfactor of
type $\mathrm{II}_1$ with finite Jones index. We consider the
actions of $N$, $M$ on $L^2(M)$. The completion of $N$ with
respect to the inner product given by the trace gives
$L^2(N)$, which is naturally regarded as a closed
subspace of $L^2(M)$. Let $e_N$ be the projection on
$L^2(M)$ onto $L^2(N)$, which is called the Jones
projection. The von Neumann algebra generated by
$M$ and $e_N$ on $L^2(M)$ is again a type $\mathrm{II}_1$ factor
and is denoted by $M_1$. This construction
is called the basic construction. We obtain
$[M_1 : M] = [M : N]$. Repeat the same procedure for $M \subset M_1$
acting on $L^2(M_1)$ this time. In this way, we have an
increasing sequence of type $\mathrm{II}_1$ factors,


$N \subset M \subset M_1 \subset M_2 \subset M_3 \subset \cdots$


which is called the Jones tower. We label the
corresponding Jones projections as $e_1 = e_N$, $e_2 = e_M$,
$e_3 = e_{M_1}$, .... We then have the following celebrated
Jones relations:


$e_j e_k = e_k e_j$ if $|j - k| > 1$,
$e_j e_{j \pm 1} e_j = \frac{1}{[M : N]} e_j$   [2]


Jones proved the above-mentioned restriction on
the possible values of the Jones index using these
relations. The realization of the index values below
4 in the set [1] by Jones also relies on these
relations of the Jones projections. The basic construction
is also possible in the other direction.
That is, we can construct a subfactor $N_1 \subset N$ so
that $N \subset M$ is the basic construction of $N_1 \subset N$.
This is called the downward basic construction.
This $N_1$ is not unique, but is unique up to an inner
automorphism of $N$.
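
The Jones relations can be checked in a small matrix model. The sketch below is a toy finite-dimensional illustration and not a subfactor: it assumes projections acting on four tensor factors of $\mathbb{C}^2$, each $e_j$ being the projection onto the antisymmetric vector in factors $j, j+1$, and the constant $1/4$ plays the role of $1/[M:N]$. All helper names are illustrative, and exact rational arithmetic is used so that the equalities are tested without rounding.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices of the same size, given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def kron(a, b):
    """Kronecker (tensor) product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def identity(size):
    return [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]

def scale(c, a):
    return [[c * entry for entry in row] for row in a]

half = Fraction(1, 2)
# Projection onto the antisymmetric vector (|01> - |10>)/sqrt(2) in C^2 (x) C^2.
singlet = [[Fraction(v) for v in row] for row in [
    [0, 0, 0, 0],
    [0, half, -half, 0],
    [0, -half, half, 0],
    [0, 0, 0, 0],
]]

# e_j acts as the singlet projection on tensor factors j, j+1 of four copies of C^2.
e1 = kron(singlet, identity(4))
e2 = kron(kron(identity(2), singlet), identity(2))
e3 = kron(identity(4), singlet)
tau = Fraction(1, 4)  # plays the role of 1/[M:N] in this toy model
```

In this model `e1` and `e3` commute, and `matmul(matmul(e1, e2), e1)` equals `scale(tau, e1)`, which are the two relations [2] with index value 4.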

A subfactor $N \subset M$ is said to be irreducible if the
relative commutant $N' \cap M$ is equal to $\mathbb{C}$. If a
subfactor has Jones index less than 4, then it is
automatically irreducible. The original realization of
the Jones index values above 4 by Jones was through
reducible subfactors. Popa proved that all the values
above 4 are realized with irreducible subfactors. A
factor is said to be hyperfinite if it has a dense
subalgebra given as the union of an increasing sequence
of finite-dimensional *-algebras. If $M$ is a hyperfinite
type $\mathrm{II}_1$ factor, then its subfactor is automatically
also hyperfinite by a deep theorem of Connes. For
hyperfinite, irreducible type $\mathrm{II}_1$ subfactors, it is still
an open problem to determine all the possible values
of the Jones index.

For type $\mathrm{II}_1$ factors $N \subset M \subset P$, the Jones index
$[P : N]$ is equal to the product $[P : M][M : N]$. Thus for
the Jones tower, we have $[M_k : N] = [M : N]^{k+1}$. In
general, if a subfactor $N \subset M$ has a finite Jones
index, then the relative commutant $N' \cap M$ is
automatically finite dimensional. So, if we start
with a type $\mathrm{II}_1$ subfactor $N \subset M$ with finite Jones
index, we have an increasing sequence of finite-dimensional
algebras as follows:


$N' \cap M \subset N' \cap M_1 \subset N' \cap M_2 \subset N' \cap M_3 \subset \cdots$   [3]


These finite-dimensional algebras are called higher
relative commutants of $N \subset M$. We draw the
Bratteli diagram for the higher relative commutants
as follows. Consider $N' \cap M_k$ (with the convention
$M_{-1} = N$, $M_0 = M$); it is a finite-dimensional
*-algebra; thus, it is of the form $\bigoplus_j M_{n_j}(\mathbb{C})$, where
we have only finitely many direct summands. We
draw a dot for each summand. We similarly draw a
dot for each summand in $\bigoplus_l M_{m_l}(\mathbb{C})$ for $N' \cap M_{k+1}$.
Let $\iota$ be the inclusion map from $N' \cap M_k = \bigoplus_j M_{n_j}(\mathbb{C})$
to $N' \cap M_{k+1} = \bigoplus_l M_{m_l}(\mathbb{C})$ and $p_l$ the
identity of $M_{m_l}(\mathbb{C})$, which is a projection in
$N' \cap M_{k+1}$. We denote by $\mu_{jl}$ the multiplicity of the
embedding map $x \mapsto \iota(x)p_l$ from $M_{n_j}(\mathbb{C})$ to $M_{m_l}(\mathbb{C})$.
Then we draw $\mu_{jl}$ edges from the $j$th dot for $M_{n_j}(\mathbb{C})$
to the $l$th dot for $M_{m_l}(\mathbb{C})$. We repeat this procedure
for all $k$, and get a picture as in Figure 1, which is
called the Bratteli diagram of the higher relative
commutants of $N \subset M$.

It turns out that the edges connecting the $k$th and
$(k + 1)$th steps of the Bratteli diagram consist of the
reflection of those connecting the $(k - 1)$th and $k$th
steps, and a (possibly empty) new part. The “new”
parts taken altogether in the above Bratteli diagram
constitute the principal graph of a subfactor $N \subset M$.
In the example of Figure 1, the principal graph is the 
Dynkin diagram A;. In general, a principal graph 
can be finite or infinite. If it is finite, we say that a 
subfactor is of finite depth. If a subfactor has the 
Jones index less than 4, it is automatically of 
finite depth and the principal graph must be one of 
the A-D-E Dynkin diagrams. 
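
The link between the Dynkin diagrams $A_n$ and the index values below 4 can be checked numerically. The sketch below relies on a standard fact that is not stated in the text above: for a finite-depth subfactor the Jones index equals the square of the norm (largest eigenvalue) of the principal graph. It computes that squared norm for the path graph $A_n$ by power iteration; the function names and the iteration count are illustrative choices.

```python
import math

def path_graph(n):
    """Adjacency matrix of the Dynkin diagram A_n (a path with n vertices)."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]

def largest_eigenvalue(adj, iterations=2000):
    """Power iteration on adj + I; the shift avoids the +/- symmetry of bipartite graphs."""
    n = len(adj)
    vec = [1.0] * n
    value = 0.0
    for _ in range(iterations):
        new = [vec[i] + sum(adj[i][j] * vec[j] for j in range(n)) for i in range(n)]
        value = math.sqrt(sum(x * x for x in new))
        vec = [x / value for x in new]
    return value - 1.0  # undo the shift by the identity

def squared_norm(adj):
    """Square of the largest eigenvalue of a symmetric adjacency matrix."""
    return largest_eigenvalue(adj) ** 2
```

For each n tried, `squared_norm(path_graph(n))` agrees with $4\cos^2(\pi/(n+1))$ to numerical precision, so $A_n$ corresponds to the index value with $m = n + 1$ in the set [1].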

Pimsner and Popa (1986) obtained the character- 
ization of the Jones index value in terms of the 
Pimsner-Popa inequality for a conditional expec- 
tation. This can be used as a definition of the index 
for a subfactor of arbitrary type (and even for 
C*-subalgebras). Kosaki obtained a definition of the 
index for type III subfactors based on works of 
Connes and Haagerup. 


Figure 1 The Bratteli diagram of the higher relative commutants.


Analytic Classification Theory 


If $M$ is a hyperfinite type $\mathrm{II}_1$ factor, then it is unique
up to isomorphism. So any subfactor of such $M$ is
isomorphic to $M$ itself. We next consider the
classification problem of hyperfinite type $\mathrm{II}_1$ subfactors.
We say that a subfactor $N \subset M$ is isomorphic to
$P \subset Q$ if we have an isomorphism of $M$ onto $Q$
which maps $N$ onto $P$. The following tower of finite-dimensional
algebras is a natural invariant for a type
$\mathrm{II}_1$ subfactor $N \subset M$ with finite Jones index and it is
called the standard invariant for $N \subset M$:


$\begin{array}{ccccccc}
M' \cap M & \subset & M' \cap M_1 & \subset & M' \cap M_2 & \subset & \cdots \\
\cap & & \cap & & \cap & & \\
N' \cap M & \subset & N' \cap M_1 & \subset & N' \cap M_2 & \subset & \cdots
\end{array}$   [4]


Each square


$\begin{array}{ccc}
M' \cap M_k & \subset & M' \cap M_{k+1} \\
\cap & & \cap \\
N' \cap M_k & \subset & N' \cap M_{k+1}
\end{array}$


is a special combination of inclusions called a
commuting square. Under a fairly general condition
(called extremality of a subfactor, which automatically
holds for an irreducible subfactor), the above
sequence [4] is anti-isomorphic to the following
sequence of finite-dimensional algebras, including
the trace values:


$\begin{array}{ccccccc}
M' \cap M & \subset & N' \cap M & \subset & N_1' \cap M & \subset & \cdots \\
\cap & & \cap & & \cap & & \\
M' \cap M_1 & \subset & N' \cap M_1 & \subset & N_1' \cap M_1 & \subset & \cdots
\end{array}$   [5]


where $\cdots \subset N_3 \subset N_2 \subset N_1 \subset N \subset M$ is given by
repeated downward basic constructions. So, if the
closure of $\bigcup_j (N_j' \cap M_1)$ in the weak operator topology
is equal to $M_1$, for an appropriate choice of $N_j$'s,
then the closure of $\bigcup_j (N_j' \cap M)$ is also $M$, and the
isomorphism class of the subfactor $N \subset M$ is recovered
from the standard invariant. In such a case, we
say that a subfactor has a generating property, and
then we have a complete classification of subfactors
in terms of the standard invariant. Popa (1994)
introduced a notion called strong amenability and
proved that a subfactor of type $\mathrm{II}_1$ is strongly
amenable if and only if it has the generating property.
This is the fundamental result in the classification of
subfactors. A hyperfinite type $\mathrm{II}_1$ subfactor with finite
Jones index and finite depth is automatically strongly
amenable, so such a subfactor is covered by this
classification theorem of Popa. Popa also has a
similar result for subfactors of type III.
Constructions and Combinatorial
Classification 


As mentioned in the above section, Jones constructed
hyperfinite type $\mathrm{II}_1$ subfactors for all
possible index values below 4. They have the
Dynkin diagrams $A_n$ as the principal graphs. It has
been an important problem to construct new
subfactors since then. Using the Hecke algebras,
Wenzl constructed a series of subfactors with index
values $\sin^2(N\pi/k)/\sin^2(\pi/k)$ with $N = 2, 3, 4, \ldots$,
where the series for $N = 2$ coincides with the ones
constructed by Jones. Wenzl's dimension estimate in
this work for the relative commutant has been an
important tool to study subfactors. It was soon
noticed that the subfactors of Jones and Wenzl are
related to the quantum groups $U_q(sl_N)$ of Drinfel'd-Jimbo,
with the deformation parameter $q$
equal to $\exp(\pi i/k)$. Constructions of subfactors from other
quantum groups have been given by Wenzl.
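
The claim that the $N = 2$ series reproduces the Jones values is a one-line consequence of the double-angle formula, spelled out here for convenience:

```latex
\frac{\sin^2(2\pi/k)}{\sin^2(\pi/k)}
  = \frac{\bigl(2\sin(\pi/k)\cos(\pi/k)\bigr)^2}{\sin^2(\pi/k)}
  = 4\cos^2(\pi/k)
```

So the Hecke algebra series with $N = 2$ gives exactly the values $4\cos^2(\pi/k)$ from the discrete part of the set [1].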

Ocneanu (1988) has introduced a notion of a
paragroup and characterized the higher relative
commutants arising from a type $\mathrm{II}_1$ subfactor with
finite Jones index and finite depth as a paragroup. If
we start with a subfactor $N \subset N \rtimes_\alpha G$ for a finite
group $G$, the corresponding paragroup contains
complete information on the group $G$ and its
representations. In this sense, a paragroup is a
generalization of a (finite) group. The basic idea is
to regard the bimodule ${}_N L^2(M)_M$ as an analog of the
fundamental representation of a compact Lie group
and make finite relative tensor products


$\cdots \otimes_N L^2(M) \otimes_M L^2(M) \otimes_N L^2(M) \otimes_M \cdots$


Then one makes an irreducible decomposition and
studies various intertwiners arising from these
irreducible bimodules. In this way, we obtain a
certain combinatorial object and it is called a
paragroup. The vertices of the principal graphs
correspond to irreducible bimodules and the edges
correspond to basis vectors in the intertwiner spaces.
Note that by Popa's theorem explained in the
previous section, a classification of subfactors of a
hyperfinite type $\mathrm{II}_1$ factor with finite Jones index
and finite depth is reduced to one of paragroups.
Using this theory of paragroups, Ocneanu has
found that the Dynkin diagrams $A_n$, $D_{2n}$, $E_6$, and
$E_8$ are realized as principal graphs of subfactors, but
$D_{2n+1}$ and $E_7$ are not. Furthermore, each of the
graphs $A_n$ and $D_{2n}$ has a unique realization and each
of $E_6$, $E_8$ has two realizations. At the index value 4,
the principal graph must be one of the extended
Dynkin diagrams, $A_{2n-1}^{(1)}$, $D_n^{(1)}$, $E_6^{(1)}$, $E_7^{(1)}$, $E_8^{(1)}$, $A_\infty$,
$A_{\infty,\infty}$, and $D_\infty$, and all are realized. (The last
three correspond to subfactors of infinite depth.)
See Evans and Kawahigashi (1998) and Goodman et
al. (1989) for these constructions and classifications. 
Evans-Kawahigashi and Xu studied the orbifold 
construction of subfactors applied to the Hecke 
algebra subfactors of Wenzl. 

In a theory of integrable lattice models, we have 
squares with labeled edges, and we assign complex 
numbers to them. A paragroup has much formal 
similarity to such a lattice model, and the para- 
groups of subfactors of Jones and Wenzl correspond 
to the lattice models of Andrews-Baxter-Forrester. 

Goodman-de la Harpe-Jones have another construction
of subfactors from the Dynkin diagrams,
and for $E_6$ this gives a hyperfinite type $\mathrm{II}_1$ subfactor
with Jones index $3 + \sqrt{3}$ and finite depth. Haagerup
has made a combinatorial study on type $\mathrm{II}_1$ subfactors
with Jones index values between 4 and $3 + \sqrt{3}$
and obtained a list of candidates of possible higher
relative commutants. Haagerup himself showed one
in the list with Jones index $(5 + \sqrt{13})/2$ is indeed
realized. Asaeda-Haagerup showed that another in
the list having the Jones index $(5 + \sqrt{17})/2$ is also
realized. These two examples are still among the
most mysterious examples of subfactors today and
do not seem to arise from other constructions using
quantum groups or conformal field theory. Izumi
has another construction of a subfactor with the
Jones index $(7 + \sqrt{29})/2$, using an endomorphism of
the Cuntz algebra.

Popa has obtained a complete characterization of
higher relative commutants including the case of
infinite depth, and axiomatized the higher relative
commutants as standard $\lambda$-lattices. Xu has
constructed standard $\lambda$-lattices, hence subfactors,
from quantum groups. Popa's realization of a
given standard $\lambda$-lattice produces a nonhyperfinite
type II$_1$ subfactor. Popa-Shlyakhtenko later showed
that any standard $\lambda$-lattice is realized for a subfactor
of a single type II$_1$ factor, the group II$_1$ factor arising
from the free group $F_\infty$ having countably many
generators, which is not hyperfinite.

Jones (1999) has introduced a combinatorial
characterization of standard $\lambda$-lattices as planar
algebras. This approach uses planar operads based
on tangles and provides a new viewpoint on the
structure of higher relative commutants. Further
studies of planar algebras have been carried out by
Bisch-Jones.


Topological Invariants in Three 
Dimensions and Tensor Categories 


Through the relations of the Jones projections, Jones
(1985) discovered the Jones polynomial as an
invariant for links. This was the beginning of a series
of entirely new theories in three-dimensional topology.
The Jones polynomial was quickly generalized
to the two-variable HOMFLY polynomial by Hoste,
Ocneanu, Millett, Freyd, Lickorish, and Yetter.

A three-dimensional topological quantum field
theory (TQFT$_3$) assigns a complex number to each
closed oriented 3-manifold and a finite-dimensional
vector space to each closed oriented surface.
Furthermore, to each compact oriented 3-manifold
with boundary, it assigns a vector in the vector space
corresponding to its boundary. Turaev-Viro have
constructed a TQFT$_3$ from combinatorial data called
quantum 6j-symbols arising from quantum groups.
Ocneanu has found that a subfactor of finite index
and finite depth also produces quantum 6j-symbols,
which give rise to a TQFT$_3$ generalizing the Turaev-Viro
construction. See Evans and Kawahigashi (1998) for
this construction. Reshetikhin-Turaev have another
construction of a TQFT$_3$ from a modular tensor
category, which is a braided tensor category with
nondegenerate braiding. Ocneanu has found a
subfactor version of the quantum double construction
which produces a modular tensor category
from a type II$_1$ subfactor of finite index and finite
depth. From a type II$_1$ subfactor of finite index and
finite depth, we can apply Ocneanu's generalization
of the Turaev-Viro construction on the one hand, and
on the other the Reshetikhin-Turaev construction to the
modular tensor category arising from the quantum
double construction of Ocneanu. The resulting two
TQFT$_3$s are shown to be equal by Kawahigashi-Sato-Wakui.
Concrete computations of these topological
invariants have been made by Sato-Wakui
based on Izumi’s work. Turaev and Wenzl have
other constructions of TQFT$_3$s and modular tensor
categories.


Algebraic Quantum Field Theory 


An operator algebraic approach to quantum field
theory is called algebraic quantum field theory, and
the standard reference is Haag (1996). In this
approach, instead of quantum fields, which are
operator-valued distributions, we consider a family
{A(O)} of von Neumann algebras parametrized by
spacetime regions O in a Minkowski space. Each
A(O) is meant to be generated by self-adjoint
operators which are observables in O. We axiomatize
such a family of von Neumann algebras and call
it a local net of von Neumann algebras. It is
enough to take O of a special form, called a double
cone. The name “local” comes from the locality
axiom, which is a mathematical expression of
Einstein causality on a Minkowski space. The
Poincaré group is used as the spacetime symmetry of
the Minkowski space. Doplicher et al. (1971, 1974)
have introduced a representation theory of a local
net A of von Neumann algebras and found that a
“physically nice” representation is realized as an
endomorphism of a single von Neumann algebra A(O)
for some fixed O. They have a notion of a statistical
dimension for such a representation, and it is an
integer (or infinite) if the spacetime dimension is
larger than 2. Longo (1989, 1990) has shown that
this statistical dimension of a representation is equal
to the square root of the index $[A(O) : \lambda(A(O))]$,
where $\lambda$ is the endomorphism of A(O)
corresponding to the representation. The relation between
algebraic quantum field theory and subfactor theory
has been found in this way. Longo (1989, 1990) has
also started a theory of canonical endomorphisms
for a subfactor, and Izumi has further studied it.
Longo has later obtained a characterization of when an
endomorphism of a factor becomes a canonical
endomorphism by introducing a Q-system.

Recently, conformal field theory has attracted 
much attention. An approach based on algebraic 
quantum field theory describes a conformal field 
theory with a local net of von Neumann algebras on 
a two-dimensional Minkowski space with diffeo- 
morphism group as the spacetime symmetry. We can 
restrict such a theory into a tensor product of two 
theories on the circle, the compactified one- 
dimensional Euclidean space. Each theory on the 
circle is called a chiral conformal field theory and 
described by a local conformal net of von Neumann 
algebras, which is a family of von Neumann 
algebras parametrized by intervals on the circle. 
The name “conformal” comes from the fact that we 
use the orientation preserving diffeomorphism group 
on the circle as the symmetry group of the space. For 
a local conformal net A of von Neumann algebras 
on the circle with a natural irreducibility assumption,
each von Neumann algebra A(I) is automatically a 
type III factor. The Doplicher-Haag-Roberts theory 
works in this setting after an appropriate adaptation 
as in Fredenhagen et al. (1989) and each representa- 
tion of a local conformal net of von Neumann 
algebras is realized by an endomorphism of A(I), 
where I is an arbitrarily fixed interval on the circle. 
(Here we do not need an assumption that a 
representation is "physically nice" since it now 
automatically holds.) Now the representations give 
a braided tensor category. 

Buchholz-Mack-Todorov constructed examples of
local conformal nets of von Neumann algebras on the
circle using the U(1)-current algebra. Wassermann
(1998) has constructed more examples using positive
energy representations of the loop groups LSU(N)
and computed their representation theory, and his
construction has been extended to other Lie groups
by Toledano Laredo and others. For the local
conformal net A of von Neumann algebras on the
circle arising from LSU(N), we take an endomorphism
$\lambda$ of A(I) arising from a representation of the
local conformal net; then we have a subfactor
$\lambda(A(I)) \subset A(I)$. This is isomorphic to the type II$_1$
subfactor constructed by Jones and Wenzl tensored
with a common type III factor.

Longo-Rehren (1995) started the study of a local
net of subfactors, $A(I) \subset B(I)$. They have defined a
certain induction procedure which gives a representation
of the larger local conformal net B from that
of A. This procedure is today called $\alpha$-induction. Xu
has studied this procedure and found several basic
properties. In the cases of local conformal nets of
subfactors arising from conformal embeddings, he
has found a simple construction of subfactors with
principal graphs $E_6$ and $E_8$ using $\alpha$-induction.
In the context of subfactor theory, $\alpha$-induction
has been further studied by Böckenhauer-Evans-Kawahigashi,
together with graphical methods of
Ocneanu on the Dynkin diagrams. More detailed
studies on local conformal nets of factors on the
circle have been pursued partly using various
techniques of subfactor theory, including the classification
of local conformal nets of von Neumann
algebras on the circle with central charge less than
1 by Kawahigashi-Longo.


See also: Algebraic Approach to Quantum Field Theory; 
Braided and Modular Tensor Categories; C*-Algebras 
and Their Classification; Hopf Algebras and 
q-Deformation Quantum Groups; The Jones Polynomial; 
Quantum 3-Manifold Invariants; Quantum Entropy; von 
Neumann Algebras: Introduction, Modular Theory, and 
Classification Theory; Yang-Baxter Equations. 
Introduction 


A vortex is commonly associated with the rotating 
motion of fluid around a common centerline. It is 
defined by the vorticity in the fluid, which measures 
the rate of local fluid rotation. Typically, the fluid 
circulates around the vortex, the speed increases as 
the vortex is approached and the pressure decreases. 
Vortices arise in nature and in technological applications
in a large range of sizes, as illustrated by the
examples given in Table 1. The next section presents
some of the mathematical background necessary to
understand vortex formation and evolution. Then
some sample flows are described, including important
instabilities and reconnection processes. Finally,
some of the numerical methods used to simulate
these flows are presented.


Background 


Table 1 Sample vortices and typical sizes

Vortex                          Diameter
Superfluid vortices             $10^{-8}$ cm (= 1 Å)
Trailing vortex of Boeing 727   1-2 m
Dust devils                     1-10 m
Tornadoes                       10-500 m
Hurricanes                      100-2000 km
Jupiter's Red Spot              25 000 km
Spiral galaxies                 Thousands of light years

Let D be a region in three-dimensional (3D) space
containing a fluid, and let $\mathbf{x} = (x, y, z)^T$ be a point in
D. The fluid motion is described by its velocity
$\mathbf{u}(\mathbf{x},t) = u(\mathbf{x},t)\,\mathbf{i} + v(\mathbf{x},t)\,\mathbf{j} + w(\mathbf{x},t)\,\mathbf{k}$, and depends on
the fluid density $\rho(\mathbf{x},t)$, temperature $T(\mathbf{x},t)$, gravitational
field $\mathbf{g}$, and other external forces possibly
acting on it. The fluid vorticity is defined by $\boldsymbol{\omega} = \nabla \times \mathbf{u}$.
The vorticity measures the local fluid rotation about an
axis, as can be seen by expanding the velocity near
$\mathbf{x} = \mathbf{x}_0$,

\[
\mathbf{u}(\mathbf{x}) = \mathbf{u}(\mathbf{x}_0) + D(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0) + \tfrac{1}{2}\,\boldsymbol{\omega}(\mathbf{x}_0) \times (\mathbf{x} - \mathbf{x}_0) + O(|\mathbf{x} - \mathbf{x}_0|^2) \qquad [1]
\]


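where

\[
D(\mathbf{x}_0) = \tfrac{1}{2}\left(\nabla\mathbf{u} + \nabla\mathbf{u}^T\right), \qquad
\nabla\mathbf{u} = \begin{pmatrix} u_x & u_y & u_z \\ v_x & v_y & v_z \\ w_x & w_y & w_z \end{pmatrix} \qquad [2]
\]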


The first term $\mathbf{u}(\mathbf{x}_0)$ corresponds to translation: all
fluid particles move with constant velocity $\mathbf{u}(\mathbf{x}_0)$.
The second term $D(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0)$ corresponds to a
strain field in the three directions of the eigenvectors
of the symmetric matrix D. If the eigenvalue
corresponding to a given eigenvector is positive,
the fluid is stretched in that direction; if it is
negative, the fluid is compressed. Note that, in
incompressible flow, $\nabla \cdot \mathbf{u} = 0$, so the sum of the
eigenvalues of D equals zero. Thus, at least one
eigenvalue is positive and one negative. If the third
eigenvalue is positive, fluid particles move towards
sheets (Figure 1a). If the third eigenvalue is negative,
fluid particles move towards tubes (Figure 1b). The
last term in eqn [1], $\tfrac{1}{2}\boldsymbol{\omega}(\mathbf{x}_0) \times (\mathbf{x} - \mathbf{x}_0)$, corresponds
to a rotation: near a point with $\boldsymbol{\omega}(\mathbf{x}_0) \neq 0$,
the fluid rotates with angular velocity $|\boldsymbol{\omega}|/2$ in a
plane normal to the vorticity vector $\boldsymbol{\omega}$. Fluid for
which $\boldsymbol{\omega} = 0$ is said to be irrotational.
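The splitting of the velocity gradient into strain and rotation can be checked numerically. The sketch below assumes a hypothetical trace-free sample matrix `G` (not taken from the text) and uses helper names chosen for illustration; it verifies that the gradient applied to a displacement equals the strain part plus one half of the vorticity crossed with the displacement.

```python
def decompose(G):
    """Split a 3x3 velocity gradient G[i][j] = d u_i / d x_j into the
    symmetric strain matrix D and the vorticity vector omega = curl u."""
    D = [[0.5 * (G[i][j] + G[j][i]) for j in range(3)] for i in range(3)]
    omega = (G[2][1] - G[1][2],   # w_y - v_z
             G[0][2] - G[2][0],   # u_z - w_x
             G[1][0] - G[0][1])   # v_x - u_y
    return D, omega

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def matvec(M, x):
    """Product of a 3x3 matrix with a 3-vector."""
    return tuple(sum(M[i][j] * x[j] for j in range(3)) for i in range(3))

# Hypothetical sample gradient with zero trace (incompressible flow).
G = [[1.0, 2.0, 0.0],
     [-1.0, 0.5, 3.0],
     [0.0, 1.0, -1.5]]
D, omega = decompose(G)

dx = (0.1, -0.2, 0.3)                      # displacement x - x0
lhs = matvec(G, dx)                        # full linear term
rhs = tuple(s + 0.5 * r                    # strain + rotation
            for s, r in zip(matvec(D, dx), cross(omega, dx)))
trace_D = D[0][0] + D[1][1] + D[2][2]      # equals div u
```

The trace of `D` equals the divergence of the velocity, so it vanishes for this sample, in line with the statement that the eigenvalues of D sum to zero in incompressible flow.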

Figure 1 Strain field: (a) two positive eigenvalues, sheet
formation and (b) one positive eigenvalue, tube formation.

A vortex line is an integral curve of the vorticity.
Since $\nabla \cdot \boldsymbol{\omega} = \nabla \cdot (\nabla \times \mathbf{u}) = 0$,
vortex lines cannot end in the
interior of the flow, but must either form a closed loop
or start and end at a bounding surface. In 2D flow,
$\mathbf{u} = u\,\mathbf{i} + v\,\mathbf{j}$ and the vorticity is $\boldsymbol{\omega} = \omega\,\mathbf{k}$, where $\omega$ is the
scalar vorticity. Thus, in 2D, the vorticity points in the
z-direction and the vortex lines are straight lines
normal to the x-y plane. A vortex tube is a bundle
of vortex lines. The strength of a vortex tube is defined
as the circulation $\oint_C \mathbf{u} \cdot d\mathbf{s}$ about a curve C enclosing
the tube. By Stokes’ theorem,

\[
\oint_C \mathbf{u} \cdot d\mathbf{s} = \iint_A \boldsymbol{\omega} \cdot \mathbf{n}\, dS \qquad [3]
\]

where A is a surface bounded by C with unit normal $\mathbf{n}$,
and thus the circulation can also be interpreted as
the flux of vorticity through a cross section of the
tube. In inviscid incompressible flow of constant
density, Helmholtz' theorem states that the tube
strength is independent of the curve C, and is
therefore a well-defined quantity, and Kelvin’s
theorem states that a tube’s strength remains
constant in time. A vortex filament is an idealization
in which a tube is represented by a single vortex line
of nonzero strength.
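The equality of circulation and vorticity flux can be illustrated with a simple planar flow. The sketch below assumes solid-body rotation with constant scalar vorticity `OMEGA` (a sample value, not from the text) and approximates the line integral around a circle by the midpoint rule; the function names are illustrative.

```python
import math

def circulation(velocity, radius, n=2000):
    """Approximate the line integral of u . ds counter-clockwise around a
    circle of the given radius centred at the origin (midpoint rule)."""
    total = 0.0
    dtheta = 2.0 * math.pi / n
    for k in range(n):
        theta = (k + 0.5) * dtheta
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        u, v = velocity(x, y)
        # ds = radius * (-sin(theta), cos(theta)) * dtheta
        total += (-u * math.sin(theta) + v * math.cos(theta)) * radius * dtheta
    return total

OMEGA = 3.0  # assumed constant scalar vorticity

def solid_body(x, y):
    """Velocity (-OMEGA*y/2, OMEGA*x/2), for which v_x - u_y = OMEGA."""
    return (-0.5 * OMEGA * y, 0.5 * OMEGA * x)

r = 0.7
gamma = circulation(solid_body, r)      # line integral around the circle
flux = OMEGA * math.pi * r ** 2         # vorticity times enclosed area
```

For this flow the two numbers agree up to rounding, since the vorticity is uniform over the disk bounded by the curve.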

The evolution equation for the fluid vorticity, as
derived from the Navier-Stokes equations, is

\[
\frac{d\boldsymbol{\omega}}{dt} = \boldsymbol{\omega} \cdot \nabla\mathbf{u} + \nu\,\Delta\boldsymbol{\omega} \qquad [4]
\]

where $d/dt = \partial/\partial t + \mathbf{u} \cdot \nabla$ is the total time derivative.
Equation [4] states that the vorticity is
transported by the fluid velocity (first term),
stretched by the fluid velocity gradient (second
term), and diffused by viscosity $\nu$ (last term). These
equations are usually nondimensionalized and written
in terms of the Reynolds number, a dimensionless
quantity inversely proportional to viscosity.

To understand high Reynolds number flow it is of
interest to study the inviscid Euler equations. The
corresponding vorticity evolution equation in 2D is

\[
\frac{d\omega}{dt} = 0 \qquad [5]
\]

which states that 2D vortex filaments in inviscid
flow move with the fluid velocity. Furthermore, in
incompressible flow, the fluid velocity is determined
by the vorticity, up to an irrotational far-field
component $\mathbf{u}_\infty$, through the Biot-Savart law,

\[
\mathbf{u}(\mathbf{x}) = -\frac{1}{4\pi} \int \frac{(\mathbf{x} - \mathbf{x}') \times \boldsymbol{\omega}(\mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|^3}\, d\mathbf{x}' + \mathbf{u}_\infty \qquad [6]
\]
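In planar 2D flow, eqn [6] reduces to

\[
\mathbf{u}(\mathbf{x}) = K_{2D} * \omega, \qquad
K_{2D}(\mathbf{x}) = \frac{1}{2\pi}\, \frac{-y\,\mathbf{i} + x\,\mathbf{j}}{|\mathbf{x}|^2} \qquad [7]
\]

where $\omega(\mathbf{x})$ is the scalar vorticity and $*$ denotes
convolution in the plane. Equations [4], [5]
and [6], [7] are the basis of the numerical methods
discussed later in this article.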

A vortex is typically defined by a region in the
fluid of concentrated vorticity. A simple model is a
point vortex in 2D flow, which corresponds to a
straight vortex filament of unit circulation. The
associated scalar vorticity is a $\delta$-function in the
plane, and the induced velocity is obtained from the
Biot-Savart law. For a point vortex at the origin,
this reduces to the radially symmetric velocity field
$\mathbf{u}(\mathbf{x}) = K_{2D} * \delta = K_{2D}(\mathbf{x})$. Corresponding particle
trajectories are shown in Figure 2a. The particle speed
$|\mathbf{u}| = 1/(2\pi r)$ increases unboundedly as the vortex
center is approached, and vanishes as $r \to \infty$
(Figure 2b). In general, the far-field velocity of a
concentrated vortex behaves similarly to that of
a point vortex, with speeds decaying as 1/r. Near
the vortex center, the velocity typically increases in
magnitude and, as a result, the fluid pressure
decreases (Bernoulli's theorem). A vortex of arbitrary
shape can be approximated by a sum of point
vortices (in 2D) or vortex filaments (in 3D), as is
often done for simulation purposes.
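As a minimal sketch of the planar Biot-Savart law for point vortices, the convolution with the kernel $(-y, x)/(2\pi|\mathbf{x}|^2)$ becomes a finite sum over vortex positions. The function names and the tuple layout `(x_j, y_j, gamma_j)` are assumptions made for illustration, and the far-field component is taken to be zero.

```python
import math

def k2d(x, y):
    """Planar kernel (-y, x) / (2*pi*(x^2 + y^2)), defined for (x, y) != 0."""
    r2 = x * x + y * y
    return (-y / (2.0 * math.pi * r2), x / (2.0 * math.pi * r2))

def induced_velocity(x, y, vortices):
    """Velocity at (x, y) induced by point vortices (x_j, y_j, gamma_j).
    The evaluation point must not coincide with a vortex position."""
    u = v = 0.0
    for xj, yj, gj in vortices:
        ku, kv = k2d(x - xj, y - yj)
        u += gj * ku
        v += gj * kv
    return u, v

# A single vortex of unit circulation at the origin, evaluated at r = 2.
u, v = induced_velocity(2.0, 0.0, [(0.0, 0.0, 1.0)])
speed = math.hypot(u, v)
```

At distance r = 2 from a unit vortex the speed equals 1/(2πr) = 1/(4π), and the velocity is perpendicular to the line joining the point to the vortex.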

Figure 2 Flow induced by a point vortex: (a) streamlines and
(b) speed |u| vs. distance r.

Vorticity can be generated by a variety of
mechanisms. For example, vorticity can be generated
by density gradients, which in turn are
induced by spatial temperature variations. This
mechanism explains the formation of warm-air
vortices when a layer of hot air is trapped
underneath cooler air. Vorticity is also generated
near solid walls in the form of boundary layers
caused by viscosity. To illustrate, imagine
horizontal flow with speed U moving past a
solid wall at rest (Figure 3a). Since in viscous
flow the fluid sticks to the wall (the no-slip
boundary condition), the fluid velocity at the wall
is zero. As a result, there is a thin layer near the
wall in which the horizontal velocity varies
greatly while the vertical velocity gradients are
small, yielding large negative vorticity values
$\omega = v_x - u_y$ (Figure 3b). Similarity solutions to
the approximating Prandtl boundary-layer equations
show that the boundary-layer thickness d
grows proportional to $\sqrt{t}$, where t measures the
time from the beginning of the motion. Boundary
layers can separate from the wall at corners or
regions of high curvature and move into the fluid
interior, as illustrated in several of the following
examples.

Figure 3 Velocity and vorticity in boundary layer near a flat
wall, panels (a) and (b).


Sample Vortex Flows 
Shear Layers 


A shear layer is a thin region of concentrated
vorticity across which the tangential velocity component
varies greatly. An example is the constant-vorticity
layer given by parallel 2D flow
$u(x, y) = U(y)$, $v(x, y) = 0$, where U is as shown in
Figure 4a. In this case, the velocity is constant
outside the layer and linear inside. The vorticity
$\omega = -U'(y)$ is zero outside the layer and constant
inside. Shear layers occur naturally in the ocean or
atmosphere when regions of distinct temperature or
density meet. To illustrate this scenario, consider a
tank containing two horizontal layers of fluids of
different densities, one on top of the other. If the
tank is tilted, the heavier bottom fluid moves
downstream, and the lighter one moves upstream,
creating a shear layer.

Figure 4 Shear layer: (a) velocity profile and (b) dispersion
relation.

Figure 5 Vortex sheet: (a) velocity profile and (b) dispersion
relation.

Flat shear layers are unstable to perturbations:
they do not remain flat but roll up into a sequence
of vortices. This is the Kelvin-Helmholtz instability,
which can be deduced analytically using linear
stability analysis. One shows that in a periodically
perturbed flat shear layer, the amplitude of a
perturbation with wave number k will initially
grow exponentially in time as $e^{\omega t}$, where $\omega = \omega(k)$
is the dispersion relation, leading to instability. The
wave number of largest growth depends on the layer
thickness. This is illustrated in Figure 4b, which
plots $\omega(k)$ for a constant-vorticity layer of thickness
2d. The wave number of maximal growth is
proportional to 1/d.
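The dependence of the growth rate on kd can be explored numerically. The sketch below is not taken from the text: it assumes the classical result usually attributed to Rayleigh for a piecewise-linear profile with velocity ±U0 outside a layer of thickness 2d, namely $(2\sigma d/U_0)^2 = e^{-4kd} - (1-2kd)^2$ for the growth rate σ. The function and parameter names are illustrative.

```python
import math

def growth_rate(kd, u0=1.0, d=1.0):
    """Growth rate of a perturbation with scaled wave number kd on a
    constant-vorticity layer of thickness 2d (velocity +/-u0 outside),
    using the assumed classical relation
        (2*sigma*d/u0)**2 = exp(-4*kd) - (1 - 2*kd)**2.
    Returns 0.0 where the right-hand side is not positive (no growth)."""
    rhs = math.exp(-4.0 * kd) - (1.0 - 2.0 * kd) ** 2
    return 0.5 * u0 / d * math.sqrt(rhs) if rhs > 0.0 else 0.0

# Scan kd on a uniform grid to locate the fastest-growing wave number
# and the largest wave number that is still unstable.
grid = [i / 1000.0 for i in range(1, 1001)]
kd_max = max(grid, key=growth_rate)
kd_cut = max(kd for kd in grid if growth_rate(kd) > 0.0)
```

Under this assumed relation the most unstable mode sits at a fixed value of kd, so the corresponding wave number scales like 1/d, and modes beyond a cutoff value of kd do not grow.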

A vortex sheet is a model for a shear layer. The
layer is approximated by a surface of zero thickness
across which the tangential velocity is discontinuous,
as illustrated in Figure 5a. In this case, the
dispersion relation reduces to $\omega(k) = \pm k$. That is,
for each wave number k there is a growing and a
decaying mode, and the growing mode grows faster
the higher the wave number is, as shown in
Figure 5b. The vortex sheet arises from a constant
vorticity shear layer as the thickness $d \to 0$ and the
vorticity $\omega \to \infty$ in such a way that the product $\omega d$
remains constant. Figure 6 shows the roll-up of a
periodically perturbed vortex sheet due to the
Kelvin-Helmholtz instability, computed using one of
the methods described in the next section.

Figure 6 Computation of vortex sheet roll-up.


Aircraft Trailing Vortices 


One can often observe trailing vortices that shed
from the wings of a flying aircraft (also called
contrails). These vortices are formed because the
wing develops lift. The pressure on the top of the
wing is lower than on the bottom, causing air to move
around the edge of the wing from the bottom
surface to the top. The boundary layer on the wing
separates as a shear layer that rolls up into a vortex
attached to the tip of the wing (Figure 7). Since the
velocity inside the vortex is high, the pressure is
correspondingly low and causes water vapor in the
air to condense, forming water droplets that
make the vortices visible. The vortex strength increases
with increasing lift, and is particularly strong in
high-lift conditions such as take-off and landing.
Since lift is proportional to weight, it also increases
with the size of the airplane. Vortices of large planes
are strong enough to flip a small one if it gets too
close. Trailing vortices are the principal reason for
the time delay required between successive take-offs
and landings and are
still a serious issue for crowded urban airports.

The trailing vortices can be modeled by a pair of
counter-rotating vortex lines (Figure 8a). Two
parallel vortex lines of opposite strength induce a
downward motion on each other, similar to two
point vortices, which are their zero-core limit. Two point
vortices of strength $\pm\Gamma$ at a distance 2d from each
other translate with self-induced velocity (Figure 9):

\[
U = \frac{\Gamma}{4\pi d} \qquad [8]
\]

As a result, trailing vortices near takeoff hit the
ground as a strong downwash air current.
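The translation speed Γ/(4πd) of a counter-rotating pair follows directly from the point-vortex velocity field. The sketch below uses assumed sample values for the circulation and half-separation and a sign convention chosen for illustration (counter-clockwise positive, with the positive vortex on the right), so that the pair moves downward.

```python
import math

def velocity_from(px, py, vx, vy, gamma):
    """Velocity at (px, py) induced by a point vortex of circulation gamma
    (counter-clockwise positive) located at (vx, vy)."""
    dx, dy = px - vx, py - vy
    r2 = dx * dx + dy * dy
    return (-gamma * dy / (2.0 * math.pi * r2),
            gamma * dx / (2.0 * math.pi * r2))

GAMMA, D = 400.0, 15.0         # assumed sample values (m^2/s and m)
left = (-D, 0.0, -GAMMA)       # (x, y, circulation)
right = (D, 0.0, GAMMA)

# Each vortex moves with the velocity induced by the other one.
v_right = velocity_from(right[0], right[1], *left)
v_left = velocity_from(left[0], left[1], *right)
expected = GAMMA / (4.0 * math.pi * D)
```

Both vortices receive the same downward velocity of magnitude `expected`, so the pair translates without changing its separation.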

Vortex decay results generally from the development
of instabilities. Two parallel vortex tubes are
subject to the long-wavelength Crow instability.
Triggered by turbulence in the surrounding air, or
by local variations in air temperature or density, the
vortices develop symmetric sinusoidal perturbations
with long wavelength, of the same order as the
vortex separation (Figure 8b). As the perturbations
grow to finite amplitude, the tubes reconnect and
produce a sequence of vortex rings. Note that the
two-dimensional schematic in Figure 8c does not
convey the three-dimensional structure of the rings.
The reconnection process destroys the initial wake
structure more rapidly than viscous decay of the
individual filaments.

Figure 7 Sketch. Shear layer separation and roll-up into
trailing vortices behind an airfoil.

Figure 8 Sketch. Onset of Crow instability in a pair of vortex
lines and ensuing reconnection.

Of much interest is the study of how to accelerate 
the vortex decay. High-aspect-ratio vortices are subject 
to a shorter-wavelength elliptic instability, which leads 
to earlier destruction. However, such vortices are not 
realistic in current aircraft wakes. Wing designs have 
been proposed in which more than two trailing 
vortices form which interact strongly and lead to 
faster decay. Other interesting aspects are the effect of 
ambient turbulence and vortex breakdown. Break- 
down refers to a disturbance in the vortex core in 
which it quickly, within an axial distance of a few core 
diameters, develops a region of reversed flow and loses 
its laminar behavior. 

Unlike the counter-rotating vortices discussed so 
far, two equally signed vortices rotate under their 
self-induced velocity about a common axis. If the 
separation distance between them is too small, two 
equally signed patches merge into one. Vortex 
merging occurs in two- or three-dimensional flows, 
as opposed to vortex reconnection, which is a 
strictly three-dimensional phenomenon. 


Figure 9 Self-induced downward motion of a vortex pair.


Vortex Rings 


A vortex tube that forms a closed loop is called a 
vortex ring. Vortex rings can be formed by ejecting 
fluid from a circular opening, such as when a smoke 
ring is formed. The boundary layer wall vorticity 
separates at the opening as a cylindrical shear layer 
that rolls up at its edge into a ring (Figure 10). The 
vorticity is concentrated in a core, which may be 
thin or thick relative to the ring diameter. The 
limiting cases are an infinitely thin circular filament 
of nonzero circulation and Hill’s vortex, in 
which the vorticity occupies all the interior of a 
sphere. 

Just like a counter-rotating vortex pair, a ring
translates under its self-induced velocity U in the
direction normal to the plane of the ring (Figure 11).
However, unlike the vortex pair, the ring velocity
depends significantly on its core thickness. For a ring
with radius R, circulation $\Gamma$, and core size a,
the self-induced velocity is

\[
U = \frac{\Gamma}{4\pi R}\left(\log\frac{8R}{a} - \frac{1}{4}\right) \qquad [9]
\]

asymptotically as $a \to 0$. Thus, the translation
velocity becomes unbounded for rings with decreasing
core size. In reality, at some point viscosity takes
over and spreads the core vorticity, slowing the ring
down.
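The thin-core ring speed can be tabulated to see how slowly it grows as the core shrinks. The sketch below assumes the asymptotic formula $U = \frac{\Gamma}{4\pi R}(\log(8R/a) - 1/4)$ with the natural logarithm; the constant 1/4 corresponds to a uniform-vorticity core, which is an assumption here, and the formula is meaningful only when a is much smaller than R. The function name is illustrative.

```python
import math

def ring_speed(gamma, R, a):
    """Asymptotic self-induced speed of a thin-cored vortex ring with
    circulation gamma, radius R and core size a (valid for a << R)."""
    return gamma / (4.0 * math.pi * R) * (math.log(8.0 * R / a) - 0.25)

# Speeds for a unit ring as the core size is reduced by factors of ten.
core_sizes = [0.1, 0.01, 0.001]
speeds = [ring_speed(1.0, 1.0, a) for a in core_sizes]

# Doubling the radius at fixed a/R halves the speed.
ratio = ring_speed(1.0, 2.0, 0.2) / ring_speed(1.0, 1.0, 0.1)
```

The speed increases only logarithmically as the core size decreases, while at fixed ratio a/R it scales like 1/R.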


Figure 10 Vortex ring, formed by ejecting fluid from a circular
tube.

Figure 11 Self-induced motion of a vortex ring.

Figure 12 Sketch. Onset of azimuthal vortex ring instability.


Vortex rings of small cross section are subject 
to an azimuthal instability. Theory, experiment, 
and simulations show that if a ring is perturbed 
in the azimuthal direction, there exists a domi- 
nant wave number which is unstable and grows 
(Figure 12). The unstable wave number increases 
as the core size decreases, while its spatial 
amplification rate is almost independent of the 
core size. 

Interesting dynamics are obtained when two or
more rings interact. Two coaxial vortex rings of
equally signed circulation move in the same
direction and interact as follows: the rear ring
causes the front ring to grow in radius and the
front ring causes the rear one to shrink. From
eqn [9] it can be seen that, apart from the logarithmic
factor, the ring velocity is
inversely proportional to its radius. Consequently,
the front ring slows down and the rear ring
speeds up, until the rear ring travels through the
front ring. This process repeats itself and is
known as leap-frogging. On the other hand, two
coaxial vortex rings of oppositely signed circulation
approach each other and grow in radius.
Their cores contract in order to preserve volume,
and their vorticity increases in order to preserve
circulation. Under certain experimental conditions,
the azimuthal instability develops, the
resulting waves on opposite rings reconnect, and
a sequence of smaller rings forms.


Vortices, Mixing, and Chaos 


Mixing is important in many natural processes and 
technological applications. For example, mixing in 
shear flows and wakes is relevant to aeronautics and 
combustion, mixing and diffusion determine chemi- 
cal reaction rates, and mixing of contaminants 
pollutes oceans and atmosphere. It is therefore 
important to understand and control mixing 
processes. 

Efficient mixing of two fluids is obtained by
repeated stretching and folding of material lines.
Stretching and folding in turn are the fingerprint
of chaos; thus, mixing and chaos are intimately
related. Mixing and associated chaotic fluid
motion can be obtained by simple vortical
motion. For example, two counter-rotating vortices
subject to a periodic strain field oscillate in a
regular fashion but induce chaos in a region of
fluid moving with them. Similarly, two corotating
vortices of equal strength that are turned on and
off periodically so that one is on when the other
is off, known as the blinking vortices, rotate
around a common axis in a stepwise manner but
induce chaos in nearby regions. On the other
hand, if there are four or more vortices present,
the vortex motion itself is generally chaotic. It
should be noted that there are also nonchaotic
equilibrium solutions of four or more vortices
forming what is called a vortex crystal.
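A blinking-vortex flow can be written as a simple map on particle positions. The sketch below makes a simplifying assumption that differs from the description above: the two vortices are held at fixed positions (±a, 0) rather than moving, so that while one vortex is on, a particle simply rotates about it through the angle Γ T/(2π r²). All names and parameter values are illustrative.

```python
import math

def rotate_about(x, y, cx, cy, gamma, T):
    """Advect (x, y) for time T in the field of one point vortex of
    circulation gamma at (cx, cy): a rotation about the vortex through
    the angle gamma*T/(2*pi*r^2), with r the distance to the vortex."""
    dx, dy = x - cx, y - cy
    r2 = dx * dx + dy * dy
    theta = gamma * T / (2.0 * math.pi * r2)
    c, s = math.cos(theta), math.sin(theta)
    return cx + c * dx - s * dy, cy + s * dx + c * dy

def blinking_map(x, y, a=1.0, gamma=1.0, T=1.0):
    """One full period: the vortex at (+a, 0) is on for time T, then the
    vortex at (-a, 0) is on for time T (fixed positions are assumed)."""
    x, y = rotate_about(x, y, a, 0.0, gamma, T)
    return rotate_about(x, y, -a, 0.0, gamma, T)

# Iterate one particle for a few periods and record its orbit.
orbit = [(0.3, 0.2)]
for _ in range(10):
    orbit.append(blinking_map(*orbit[-1]))
```

Plotting many iterates of such a map for different initial points is one way to obtain the Poincaré sections used to study chaotic particle motion.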

Information about chaotic particle motion is 
obtained by studying Poincaré sections, examining 
the associated stable and unstable manifolds, and 
investigating the existence of chaotic maps such as 
the horseshoe map. 


Atmospheric Vortices 


Atmospheric vortices are driven by temperature
gradients, Earth’s rotation (Coriolis force), spatial
landscape variations, and instabilities. For example,
temperature differences between the equator and the
poles and Earth's rotation lead to large-scale
vortices such as the trade winds (Hadley cell), the
jet streams, and the polar vortex (Figure 13).
Semiannual temperature oscillations are responsible for
the Indian monsoons. Daily oscillations cause land
and sea breezes. Landscape variations can cause
urban-rural wind flows and mountain-valley
circulations.

Figure 13 Vortices in the atmosphere.

Instabilities are often responsible for large
cyclonic vortices. Barotropic instability results
from large horizontal velocity gradients, and has
been deemed responsible for disturbances over the
Sahara region that occasionally intensify into
tropical cyclones. Baroclinic instability, which
occurs when temperature advection is superposed
on a velocity field, can lead to cyclonic vortices at
the front between air of polar origin and that of
tropical origin. The inertial or centrifugal
instability occurs when air flows around high-pressure
systems and the pressure gradient force
is not large enough to balance the centripetal
acceleration and the Coriolis effect.

Vortices also form on other planets with an 
atmosphere. On Mars, dust devils are quite 
common. They are about 10-50 times larger than the 
ones on Earth and can carry high-voltage electric 
fields caused by the rubbing of dust grains against 
each other. Jupiter’s characteristic spots are 
extremely large storm vortices. The Great Red 
Spot is a vortex spanning twice the diameter of 
the Earth. Unlike the low-pressure terrestrial 
storms and hurricanes, the Great Red Spot is a 
high-pressure system that has been stable for 
more than 300 years. Other vortices on Jupiter 
decay and vanish, such as the White Ovals, three 
large anticyclones which merged into one within 
two years. Recent computer simulations predict 
that many of Jupiter's vortices will merge and 
disappear in the next decade. As a result, mixing 
of heat across zones will decay and the planet's 
temperature is predicted to increase. 

Numerical simulations of the atmosphere are 
expensive due to the large number of parameters 
and the relatively small scales that need to be 
resolved. For climate models and medium-range 
forecast models, the governing 3D compressible 
Euler equations are simplified using the hydro- 
static approximation (in which only the pressure 
gradient and the gravitational forces are retained 
in the vertical-momentum equation) and the 
anelastic approximation (in which $\partial\rho/\partial t$ is 
neglected), to obtain the primitive equations. 
Additional vertical averaging yields the shallow- 
water equations. One big hurdle is to accurately 
incorporate the effect of clouds, which is sig- 
nificant and is usually treated using subgrid 
models. 


Vortices in Superfluids and Superconductors 


At temperatures below 2.2 K, liquid helium is a
superfluid, meaning that it acts essentially like a
fluid with zero viscosity governed by the Euler
equations. The fluid is irrotational, except for
extremely thin vortex filaments, which are formed
by quantum-mechanical processes. Since the vortices
cannot end in the interior of the flow, they can be
generated only at the surface or they nucleate as
vortex rings inside the fluid. As an example, if
a cylindrical container with helium is rotated
sufficiently fast, vortex lines attached to both ends
of the container appear. These quantum vortices
have discrete values of circulation (= nh/m, where
h is Planck’s constant, m is the mass of a helium atom,
and n is an integer), core sizes of about 1 Å (roughly the
diameter of a single hydrogen atom) and move
without viscosity.
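The size of the circulation quantum h/m can be estimated from standard constants. The sketch below assumes the helium-4 isotope and uses the SI value of Planck's constant together with the atomic mass of helium-4; the constant and function names are illustrative.

```python
# Planck's constant in J s (exact by the SI definition).
H_PLANCK = 6.62607015e-34
# Mass of a helium-4 atom in kg: atomic mass 4.002602 u times the
# atomic mass constant (helium-4 is an assumption made here).
M_HE4 = 4.002602 * 1.66053906660e-27

def circulation(n):
    """Quantized circulation n*h/m in m^2/s for integer n."""
    return n * H_PLANCK / M_HE4

kappa = circulation(1)
print(f"circulation quantum: {kappa:.3e} m^2/s")
```

The resulting quantum is of order $10^{-7}$ m²/s, which, combined with the ångström-scale core, indicates how small these vortices are compared with the classical examples in Table 1.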

Similarly, certain types of materials lose their 
electric resistance at low temperatures and 
become superconductors. One distinguishes type-I 
superconductors (most pure metals) from type-II 
superconductors (alloys). Using the Ginzburg- 
Landau theory it has been predicted that in 
type-II superconductors a lattice of vortex fila- 
ments forms, each carrying a quantized amount 
of magnetic flux. This was subsequently confirmed
by experimental observation. More precisely, for
temperatures T below a critical value $T_c$, there are
three regions corresponding to increasing values of
the magnetic field (Figure 14). At low magnetic
fields ($H < H_{c1}$), no vortices exist (superconducting
phase). At intermediate values ($H_{c1} < H < H_{c2}$), the
magnetic field penetrates the superconductor in the
form of quantized vortices, also called flux lines
(mixed phase). The values $H_{c1}$ and $H_{c2}$ are
determined by the London penetration depth λ, which
measures the electromagnetic response of the
superconductor. With increasing magnetic field, the
density of flux lines increases until the vortex cores
overlap when the upper critical field $H_{c2}$ is reached,
beyond which one recovers the normal metallic
state (normal conductor).
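
The three regimes just described can be summarized in a few lines. This is only a sketch: the function name, the string labels, and the convention that the caller supplies the two critical fields for the temperature of interest are assumptions made here.

```python
def superconductor_phase(h: float, h_c1: float, h_c2: float) -> str:
    """Classify a type-II superconductor at fixed T < T_c by applied field h.

    h_c1 and h_c2 are the lower and upper critical fields at that temperature
    (both assumed known; their temperature dependence is not modeled here).
    """
    if not 0.0 <= h_c1 < h_c2:
        raise ValueError("expected 0 <= h_c1 < h_c2")
    if h < 0.0:
        raise ValueError("field magnitude must be non-negative")
    if h < h_c1:
        return "superconducting"   # no vortices
    if h < h_c2:
        return "mixed"             # quantized flux lines penetrate
    return "normal"                # vortex cores overlap
```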

When an external current density j is applied to
the vortex system, the flux lines start to move under
the action of the Lorentz force. As a result, a
dissipating electric field E appears that is parallel to
j, and the superconducting property of dissipation-free
current flow is lost. In order to recover the
desired property of dissipation-free flow, flux lines
have to be pinned, for example, by introducing
inhomogeneities and structural defects. For a given
pinning force, flux lines remain pinned as long as the
current density stays below a critical value. A major
research objective is to optimize the pinning force in
order to preserve superconductivity at larger current
densities.

Figure 14 Superconductor phase dependence on magnetic
field H and temperature T.


Numerical Vortex Methods 


Many numerical methods used to compute fluid 
flow are Eulerian schemes based on a fixed mesh, 
such as finite difference, finite element, and spectral 
methods, commonly used for example in atmo- 
sphere and ocean modeling. This section briefly 
describes alternative  vorticity-tracking methods 
used to simulate incompressible inviscid vortex 
flows, and concludes with some extensions to 
viscous flows. The premise of these methods is 
that since the fluid velocity is determined by the 
vorticity through the Biot-Savart law (eqn [6]), it 
suffices to track only that portion of the fluid 
carrying nonzero vorticity. This region is often 
much smaller than the total fluid volume, and 
computational efficiency is gained. Numerical vor- 
tex methods are typically Lagrangian, that is, the 
computational elements move with the fluid 
velocity. 


Point-Vortex Approximation in 2D 


To compute the evolution of a vorticity distribution
ω(x,t) in 2D, the simplest approach is to approximate
the vorticity by a set of point vortices at $x_j(t)$
with circulation $\Gamma_j$ and evolve them under their
self-induced motion. The values $\Gamma_j$ are an estimate of
the initial circulation around $x_j(0)$. The vortex positions
$x_j(t)$ evolve in the induced velocity field

$$ \frac{dx_j}{dt} = \sum_{k=1,\, k \neq j}^{N} \Gamma_k\, K_{2D}(x_j - x_k) $$ [10]

where the exclusion k ≠ j accounts for the fact that
a point vortex induces zero velocity on itself. The
solution to the system of ordinary differential
equations [10] can be obtained using any standard
time-stepping method,
such as Runge-Kutta or Adams-Bashforth.
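
A minimal sketch of such a computation is given below. It assumes the standard planar Biot-Savart kernel K_2D(x, y) = (−y, x)/(2π(x² + y²)) for eqn [10] and uses classical fourth-order Runge-Kutta; the function names are illustrative.

```python
import math


def velocities(pos, gamma):
    """Right-hand side of [10], assuming K_2D(x, y) = (-y, x) / (2*pi*r^2)."""
    vel = []
    for j, (xj, yj) in enumerate(pos):
        u = v = 0.0
        for k, (xk, yk) in enumerate(pos):
            if k == j:
                continue  # a point vortex induces no velocity on itself
            dx, dy = xj - xk, yj - yk
            r2 = dx * dx + dy * dy
            u -= gamma[k] * dy / (2.0 * math.pi * r2)
            v += gamma[k] * dx / (2.0 * math.pi * r2)
        vel.append((u, v))
    return vel


def _shift(pos, vel, h):
    return [(x + h * u, y + h * v) for (x, y), (u, v) in zip(pos, vel)]


def rk4_step(pos, gamma, dt):
    """One classical Runge-Kutta step for all vortex positions."""
    k1 = velocities(pos, gamma)
    k2 = velocities(_shift(pos, k1, dt / 2), gamma)
    k3 = velocities(_shift(pos, k2, dt / 2), gamma)
    k4 = velocities(_shift(pos, k3, dt), gamma)
    return [
        (x + dt * (a[0] + 2 * b[0] + 2 * c[0] + d[0]) / 6,
         y + dt * (a[1] + 2 * b[1] + 2 * c[1] + d[1]) / 6)
        for (x, y), a, b, c, d in zip(pos, k1, k2, k3, k4)
    ]


def evolve(pos, gamma, t_end, steps):
    dt = t_end / steps
    for _ in range(steps):
        pos = rk4_step(pos, gamma, dt)
    return pos


# Two equal vortices a unit distance apart co-rotate about their midpoint
# with angular velocity 1/pi, so they swap places after time pi**2.
final = evolve([(0.5, 0.0), (-0.5, 0.0)], [1.0, 1.0], math.pi ** 2, 1000)
```

The direct double loop costs O(N²) operations per evaluation, which is acceptable for a demonstration but not for large N.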
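The point-vortex approximation can be written in
Hamiltonian form as

$$ \frac{dx_j}{dt} = \frac{1}{\Gamma_j}\frac{\partial H}{\partial y_j}, \qquad \frac{dy_j}{dt} = -\frac{1}{\Gamma_j}\frac{\partial H}{\partial x_j} $$ [11]

where the Hamiltonian

$$ H = -\frac{1}{8\pi} \sum_{j \neq k} \Gamma_j \Gamma_k \ln\left[(x_j - x_k)^2 + (y_j - y_k)^2\right] $$ [12]

is conserved along the motion, dH/dt = 0. The
method also conserves the fluid circulation and the
linear and angular momenta.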
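
The equivalence of the Hamiltonian form and the velocity sum [10] can be checked numerically. The sketch below assumes the kernel K_2D(x, y) = (−y, x)/(2π(x² + y²)) and the normalization H = −(1/4π) Σ_{j≠k} Γ_j Γ_k ln|x_j − x_k| (sum over ordered pairs); both are assumptions about conventions, and the gradient of H is approximated by central differences.

```python
import math


def hamiltonian(pos, gamma):
    """H = -(1/(4*pi)) * sum over ordered pairs j != k of G_j G_k ln|x_j - x_k|."""
    total = 0.0
    for j, (xj, yj) in enumerate(pos):
        for k, (xk, yk) in enumerate(pos):
            if j != k:
                total += gamma[j] * gamma[k] * math.log(math.hypot(xj - xk, yj - yk))
    return -total / (4.0 * math.pi)


def biot_savart_velocity(pos, gamma, j):
    """Velocity of vortex j from the direct sum [10]."""
    u = v = 0.0
    xj, yj = pos[j]
    for k, (xk, yk) in enumerate(pos):
        if k != j:
            dx, dy = xj - xk, yj - yk
            r2 = dx * dx + dy * dy
            u -= gamma[k] * dy / (2.0 * math.pi * r2)
            v += gamma[k] * dx / (2.0 * math.pi * r2)
    return (u, v)


def hamilton_velocity(pos, gamma, j, eps=1e-6):
    """Velocity of vortex j from Hamilton's equations, by central differences."""
    def h_shifted(dx, dy):
        moved = list(pos)
        moved[j] = (pos[j][0] + dx, pos[j][1] + dy)
        return hamiltonian(moved, gamma)

    dh_dx = (h_shifted(eps, 0.0) - h_shifted(-eps, 0.0)) / (2 * eps)
    dh_dy = (h_shifted(0.0, eps) - h_shifted(0.0, -eps)) / (2 * eps)
    return (dh_dy / gamma[j], -dh_dx / gamma[j])
```

For any configuration with distinct positions and nonzero circulations the two velocities should agree up to the finite-difference error.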

Ideally, the solution to [10] should converge as
N → ∞ to the solution of the Euler equations.
This is true for smooth vorticity distributions, but 
for singular distributions such as a vortex sheet, 
the situation is more complicated. The vortex 
sheet, a curve in the plane, develops a singularity 
in finite time at which the curvature becomes 
unbounded at a point. The point-vortex approx- 
imation converges before the singularity formation 
time, provided the growth of spurious roundoff 
error due to Kelvin-Helmholtz instability is 
suppressed using a filter. However, past the 
singularity formation time, the  point-vortex 
approximation no longer converges. 

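The general approach is to replace the singular
kernel $K_{2D}$ by a regularization $K_{2D,\delta}$, such as

$$ K_{2D,\delta}(x) = \frac{(-y,\, x)}{2\pi\,(|x|^2 + \delta^2)} $$ [13a]

$$ K_{2D,\delta}(x) = K_{2D}(x)\left(1 - e^{-|x|^2/\delta^2}\right) $$ [13b]
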

where δ is a numerical parameter. The regularization
amounts to replacing the δ-function vorticity
of a point vortex by an approximate δ-function. In
order to recover the solution to the Euler equations,
it is necessary to study the limit N → ∞, δ → 0. For
smooth vorticity distributions, this process converges.
For vortex sheet initial data, there is
evidence of convergence, but details of the limiting
behavior remain under investigation. Regularized
solutions with fixed value δ and vortex sheet initial
data are shown in Figures 6 and 15. Figure 6 shows 
the onset of the Kelvin-Helmholtz instability in a 
periodically perturbed flat vortex sheet. Figure 15 
shows the rollup of an elliptically loaded flat vortex 
sheet that models the evolution of an aircraft 
wake (see Figure 7). The correspondence between 
the two-dimensional simulation and the three- 
dimensional wake is made by replacing the spatial 
coordinate in the aircraft's line of flight by a time 
coordinate. 
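
To illustrate the effect of regularization, the following sketch compares the singular kernel with one commonly used smoothed kernel, (−y, x)/(2π(r² + δ²)). This particular smoothing and the function names are assumptions made for the example.

```python
import math


def k2d(x, y):
    """Singular kernel (-y, x) / (2*pi*r^2); undefined at the origin."""
    r2 = x * x + y * y
    return (-y / (2.0 * math.pi * r2), x / (2.0 * math.pi * r2))


def k2d_blob(x, y, delta):
    """Smoothed kernel (-y, x) / (2*pi*(r^2 + delta^2)); bounded everywhere."""
    r2 = x * x + y * y + delta * delta
    return (-y / (2.0 * math.pi * r2), x / (2.0 * math.pi * r2))


def speed(v):
    return math.hypot(v[0], v[1])
```

The smoothed speed vanishes at the vortex center, peaks at 1/(4πδ) at distance δ, and approaches the singular speed 1/(2πr) once r is much larger than δ.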

Figure 15 Computed evolution of an elliptically loaded flat
vortex sheet.


Contour Dynamics in 2D 


Consider a planar patch of constant vorticity $\omega_0$
bounded by a curve x(s,t), 0 ≤ s ≤ L, moving in
inviscid, incompressible flow. In view of Kelvin's
theorem and eqn [5], the vorticity in the patch
remains constant and equal to $\omega_0$ for all time, and
the patch area remains constant. Only the patch
boundary moves. The velocity at a point x(α,t) on
the boundary can be written as a line integral over
the boundary:

$$ \frac{dx}{dt} = -\frac{\omega_0}{2\pi} \oint \ln|x - x(s,t)|\, \frac{\partial x}{\partial s}\, ds $$ [14]



The contour dynamics method consists of approx- 
imating a given vorticity distribution by a super- 
position of vortex patches, and moving their 
boundaries according to eqn [14]. This method 
has been applied to compute the evolution of 
single-vortex patches and shear layers, and to 
geophysical flows. Typically, filamentation occurs: 
the patch develops thin filaments which increase the 
boundary length significantly and thereby the 
computational expense. The approach generally 
taken is to remove the thin filaments at several 
times throughout the computation, which is 
referred to as contour surgery. The contour 
dynamics approach as well as the point-vortex 
approximation have also been generalized to treat 
quasigeostrophic flows. 
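
The boundary integral can be tested on a circular patch, which rotates rigidly so that its boundary moves tangentially with speed ω₀R/2. The sketch below assumes the form dx/dt = −(ω₀/2π) ∮ ln|x − x(s)| (∂x/∂s) ds for eqn [14], a boundary parametrized over [0, 2π) rather than [0, L], and a simple staggered midpoint rule; all names are illustrative.

```python
import math


def patch_boundary_velocity(curve, dcurve, alpha, omega0, n=4000):
    """Approximate the contour integral at the boundary point curve(alpha).

    curve(s) and dcurve(s) return the boundary point and its s-derivative for
    s in [0, 2*pi).  The quadrature nodes are offset by half a step from
    s = alpha, so the logarithmic singularity is never evaluated.
    """
    x0, y0 = curve(alpha)
    h = 2.0 * math.pi / n
    u = v = 0.0
    for i in range(n):
        s = alpha + (i + 0.5) * h
        xs, ys = curve(s)
        tx, ty = dcurve(s)
        w = math.log(math.hypot(x0 - xs, y0 - ys)) * h
        u += w * tx
        v += w * ty
    c = -omega0 / (2.0 * math.pi)
    return (c * u, c * v)
```

The midpoint rule is only first-order accurate here because of the logarithmic singularity; production codes use specialized quadrature.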


Vortex Filament Methods in 3D


Vortex simulations in 3D differ from those in 2D in 
that the stretching term in eqn [4] needs to be 
incorporated. The vortex filament method approx- 
imates the fluid vorticity by a finite number of 
filaments whose circulation remains constant in 
time. Each filament is marked by computational 
mesh points which move with the regularized 
induced velocity. The regularization is necessary to 
prevent the infinite self-induced velocities of curved 
vortex filaments. As in 2D, this method automati- 
cally conserves circulation. Vorticity stretching is 
accounted for by the stretching between computa- 
tional mesh points. As the filament length increases, 
more meshpoints are typically introduced to keep it 
resolved. Also, the number of filaments can be 
increased throughout the simulation to maintain 
resolution. 
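
A sketch of the regularized induced velocity for a single closed filament follows. The article does not specify the smoothing; the code assumes the widely used form in which |x − x'|³ in the Biot-Savart integrand is replaced by (|x − x'|² + δ²)^{3/2}, and it discretizes the filament by mesh points with centered-difference tangents.

```python
import math


def ring(n, radius=1.0):
    """Mesh points on a circular filament in the plane z = 0, counterclockwise."""
    return [(radius * math.cos(2 * math.pi * i / n),
             radius * math.sin(2 * math.pi * i / n), 0.0) for i in range(n)]


def induced_velocity(x, nodes, circulation, delta):
    """Regularized Biot-Savart sum over a closed filament given by mesh points."""
    n = len(nodes)
    u = [0.0, 0.0, 0.0]
    for i in range(n):
        p_prev, p, p_next = nodes[i - 1], nodes[i], nodes[(i + 1) % n]
        # Line element at p from a centered difference of neighboring points.
        dl = [(p_next[c] - p_prev[c]) / 2.0 for c in range(3)]
        r = [x[c] - p[c] for c in range(3)]
        denom = (r[0] ** 2 + r[1] ** 2 + r[2] ** 2 + delta ** 2) ** 1.5
        cross = (dl[1] * r[2] - dl[2] * r[1],
                 dl[2] * r[0] - dl[0] * r[2],
                 dl[0] * r[1] - dl[1] * r[0])
        for c in range(3):
            u[c] += circulation * cross[c] / (4.0 * math.pi * denom)
    return tuple(u)
```

For a circular ring the self-induced velocity is the same at every mesh point and directed along the ring axis, so the ring translates without change of shape; with δ = 0 the sum would diverge as the mesh is refined.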


Viscous Vortex Methods 


While inviscid models are expected to approximate 
small viscosity fluids well far from boundaries, near 
boundaries, where vortex shedding is an inherently 
viscous mechanism, it is important to incorporate 
the effects of viscosity. The first methods to do so 
used operator splitting in which inviscid and viscous 
terms of the Navier-Stokes equations were solved in 
a sequential manner. In each time step, the compu- 
tational elements would first be convected, and then 
they would be diffused by a random-walk scheme. 
The particle strength exchange method, introduced 
more recently, does not rely on operator splitting 
and has better accuracy. The particle position and 
vorticity evolve simultaneously, and viscous 
diffusion is accounted for in a consistent manner. 
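
The diffusion substep of the operator-splitting approach can be sketched as follows: each computational element takes an independent Gaussian step with variance 2νΔt in each coordinate, which is the variance of the heat kernel for diffusivity ν. Only this substep is shown; the convection substep, the function names, and the fixed seed are choices made for the example.

```python
import math
import random


def random_walk_step(particles, nu, dt, rng):
    """Displace every particle by a Gaussian step of variance 2*nu*dt per axis."""
    sigma = math.sqrt(2.0 * nu * dt)
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            for x, y in particles]


def diffuse(particles, nu, dt, steps, seed=0):
    rng = random.Random(seed)   # fixed seed only to make the demo repeatable
    for _ in range(steps):
        particles = random_walk_step(particles, nu, dt, rng)
    return particles


def variance_x(particles):
    mean = sum(x for x, _ in particles) / len(particles)
    return sum((x - mean) ** 2 for x, _ in particles) / len(particles)
```

Starting all particles at one point, the sample variance after time t should be close to 2νt, with a statistical error that decays only like the inverse square root of the number of particles.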

Vortex dynamics continues to be a source of 
interesting problems of theoretical and practical 
importance. In particular, much remains to be 
learned to better understand turbulence and the 
transition to turbulence, a process dominated by 
deterministic vortex dynamics. 


Further Remarks 


Finally, some remarks on relevant literature on this 
subject are in order. Lugt (1983) and Tritton (1988) 
are recommended as elementary introductions to
vortex flows. van Dyke (1982) presents beautiful and 
instructive flow visualizations. Comprehensive treat- 
ments of incompressible fluid dynamics are given in 
Batchelor (1967), Chorin and Marsden (1992), Lamb 
(1932), and Saffman (1992), and compressible flow is 
treated in Anderson (1990). Cottet and Koumoutsakos 
(2000) give an overview of numerical vortex methods. 


Special topics have also been addressed: atmosphere
(Andrews et al. 1987), point vortex motion and chaos
(Aref 1983, Newton 2001, Ottino 1989), superfluids
and superconductors (Blatter et al. 1994, Donnelly
1991), turbulence theory using statistical mechanics
(Chorin 1994), vortex reconnection (Kida and 
Takaoka 1994), theory for Euler and Navier-Stokes 
equations (Majda and Bertozzi 2002), contour 
dynamics (Pullin 1992), vortex rings (Shariff and 
Leonard 1992), and aircraft trailing vortices (Spalart 
1998). Green (1995) includes survey articles on 
various topics. 


Nomenclature 

a                    vortex ring core size
g                    gravitational field
H                    Hamiltonian
$K_{2D}$             singular velocity kernel
$K_{2D,\delta}$      regularized velocity kernel
ρ(x,t)               fluid density
R                    vortex ring radius
T(x,t)               temperature
U                    translation velocity
u(x,t) = u(x,t)i + v(x,t)j + w(x,t)k    fluid velocity
ω(k)                 dispersion relation
ω = ∇ × u            vorticity
ω = $v_x - u_y$      scalar vorticity
Γ                    ring circulation

See also: Abelian Higgs Vortices; Incompressible Euler 
Equations: Mathematical Theory; Integrable Systems: 
Overview; Interfaces and Multicomponent Fluids; 
Intermittency in Turbulence; Newtonian Fluids and 
Thermohydraulics; Point-Vortex Dynamics; Stochastic 
Hydrodynamics; Superfluids; Topological Knot Theory 
and Macroscopic Physics; Turbulence Theories. 


Introduction


The most basic wave equation is

$$ \frac{\partial^2 u}{\partial t^2} - \Delta u = 0 $$ [1]

for u = u(t,x), where Δ is the Laplace operator,
given by $\Delta u = \partial^2 u/\partial x_1^2 + \cdots + \partial^2 u/\partial x_n^2$ on
n-dimensional Euclidean space $\mathbb{R}^n$. More generally, u might
be defined on R × M, where R is the t-axis and M is
a Riemannian manifold, with a metric tensor given
in local coordinates by $(g_{jk})$. Then the Laplace-Beltrami
operator is given, in local coordinates, by

$$ \Delta u = g^{-1/2} \sum_{j,k} \frac{\partial}{\partial x_j}\left( g^{1/2}\, g^{jk}\, \frac{\partial u}{\partial x_k} \right) $$ [2]

where $(g^{jk})$ is the matrix inverse to $(g_{jk})$ and
$g = \det(g_{jk})$. Even if one concentrates on wave
propagation in Euclidean space, one frequently
wants to use curvilinear coordinates, and [2] is
useful. Equation [1] is supplemented by initial
conditions of the form


$$ u(0,x) = f(x), \qquad \partial_t u(0,x) = g(x) $$ [3]

called Cauchy data. If the spatial domain M has a
boundary ∂M (e.g., if M is a bounded region in $\mathbb{R}^n$),
then boundary conditions are imposed. The most
common are the Dirichlet boundary condition

$$ u(t,x) = 0 \quad \text{for } x \in \partial M $$ [4]

and the Neumann boundary condition

$$ \partial_\nu u(t,x) = 0 \quad \text{for } x \in \partial M $$ [5]

where $\partial_\nu u$ denotes the normal derivative of u at the
boundary. More generally, one might have a driving
force, and replace 0 on the right-hand side of [1] by
a function F(t,x). Similarly, one can consider
nonzero boundary data in [4] and [5].
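
As a worked instance of the coordinate formula [2], read as $\Delta u = g^{-1/2}\sum_{j,k}\partial_j\big(g^{1/2} g^{jk}\partial_k u\big)$, consider polar coordinates on the plane. The choice of polar coordinates is ours, made only to illustrate how the formula is used.

```latex
% Polar coordinates (r, \theta) on R^2, where ds^2 = dr^2 + r^2 d\theta^2.
\[
(g_{jk}) = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}, \qquad
(g^{jk}) = \begin{pmatrix} 1 & 0 \\ 0 & r^{-2} \end{pmatrix}, \qquad
g = r^2, \qquad g^{1/2} = r .
\]
% Insert these into the Laplace-Beltrami formula term by term.
\[
\Delta u
 = \frac{1}{r}\left[
     \frac{\partial}{\partial r}\!\left( r\,\frac{\partial u}{\partial r} \right)
   + \frac{\partial}{\partial \theta}\!\left( \frac{1}{r}\,\frac{\partial u}{\partial \theta} \right)
   \right]
 = \frac{\partial^2 u}{\partial r^2}
   + \frac{1}{r}\frac{\partial u}{\partial r}
   + \frac{1}{r^2}\frac{\partial^2 u}{\partial \theta^2} .
\]
```

This is the familiar polar form of the Laplacian, so the wave equation [1] in these coordinates reads $u_{tt} - u_{rr} - r^{-1}u_r - r^{-2}u_{\theta\theta} = 0$.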


The wave equation [1] models a number of
physical phenomena, at least in the linear approximation.
The vibration of a drum head is modeled by
[1], with M a planar domain, and with the Dirichlet
boundary condition [4]. The motion of sound waves
in a room with hard walls is modeled by [1], with M
a region in $\mathbb{R}^3$, and with the Neumann boundary
condition [5]. The propagation of electromagnetic
waves is given by Maxwell's equations:

$$ \frac{\partial E}{\partial t} - \operatorname{curl} B = -J, \qquad \frac{\partial B}{\partial t} + \operatorname{curl} E = 0, \qquad \operatorname{div} E = \rho, \qquad \operatorname{div} B = 0 $$ [6]

where ρ is the electric charge density and J the
current. These equations yield [1] (with the right-hand
side replaced by some function F(t,x) if J and ρ
are not zero) for the components of the electric field
E and the magnetic field B. If the propagation is in a
region M in $\mathbb{R}^3$ bounded by a perfect conductor,
then the boundary conditions are that E is normal to
∂M and B is tangential to ∂M. If ∂M is flat, these
equations can be decomposed into Dirichlet problems
for some components and Neumann problems
for the rest, but if ∂M is curved such a decomposition
is not possible.

Other models of vibrating objects produce variants
of [1]. Examples include vibrating elastic
solids, yielding an equation like [1] with Δu
replaced by μΔu + (λ + μ) grad div u, for linear
elasticity. Here Δ acts componentwise on u, and μ
and λ are constants, called Lamé constants. Other
examples model vibrations of crystals and propaga- 
tion of electromagnetic waves in crystals. Further 
interesting phenomena arise in these various cases, 
such as Rayleigh waves in linear elasticity and 
conical refraction in crystal optics. 

Here we discuss the propagation of waves and their 
reflection and diffraction at boundaries. In the interest 
of providing reasonable coverage in a brief space, we 
restrict attention to the wave equation [1]. 


Basic Propagation Phenomena


The simplest examples of waves propagating according
to [1] are plane waves, of the form

$$ u(t,x) = \varphi(x \cdot \omega - t) $$ [7]

with (t,x) ∈ R × $\mathbb{R}^n$, ω a unit vector in $\mathbb{R}^n$, and φ a
function on R. If φ has two continuous derivatives,
[7] defines a classical solution of [1]. More
generally, one can allow φ to be less regular. For
example, it could be piecewise smooth with a jump
discontinuity at some point a ∈ R. In such a case, u
will be piecewise smooth with a jump across the
n-dimensional surface x·ω − t = a in R × $\mathbb{R}^n$, which
will solve [1] in a weak, or distributional, sense. For
fixed t, u(t,·) has a jump across the (n − 1)-dimensional
surface $\Sigma_t = \{x \in \mathbb{R}^n : x \cdot \omega = t + a\}$. As
t varies, $\Sigma_t$ moves in the direction ω with unit speed.
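
That a smooth plane wave φ(x·ω − t) satisfies the wave equation can be checked by finite differences. The following sketch (function names and step size are illustrative) builds a plane wave from a profile and a direction and estimates u_tt − Δu at a point.

```python
import math


def plane_wave(phi, omega):
    """Return u(t, x) = phi(x . w - t), with w the unit vector along omega."""
    norm = math.sqrt(sum(w * w for w in omega))
    unit = [w / norm for w in omega]

    def u(t, x):
        return phi(sum(xi * wi for xi, wi in zip(x, unit)) - t)

    return u


def wave_residual(u, t, x, h=1e-3):
    """Second-order central-difference estimate of u_tt - Laplacian(u)."""
    u0 = u(t, x)
    u_tt = (u(t + h, x) - 2.0 * u0 + u(t - h, x)) / h ** 2
    lap = 0.0
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        lap += (u(t, xp) - 2.0 * u0 + u(t, xm)) / h ** 2
    return u_tt - lap
```

The residual is zero up to discretization and rounding error precisely because ω has unit length; rescaling ω without rescaling time would change the propagation speed and leave a nonzero residual.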

There are also spherical wave solutions to [1] on
R × $\mathbb{R}^n$, such as

$$ u(t,x) = \frac{\operatorname{sgn} t}{2\pi}\, (t^2 - |x|^2)_+^{-1/2} $$ [8]

for n = 2, and

$$ u(t,x) = \frac{1}{4\pi t}\, \delta(|x| - |t|) $$ [9]

for n = 3. Here $s_+ = s$ for s > 0, $s_+ = 0$ for s ≤ 0, and
δ(s) is the Dirac delta function. In fact, [8] and [9]
are "fundamental solutions" (more on which in the
section on harmonic analysis) to the wave equation
on R × $\mathbb{R}^n$, for n = 2 and 3, respectively. In such
cases, the singularity in u(t,·) for each fixed t lies in
$\Sigma_t = \{x \in \mathbb{R}^n : |x| = |t|\}$, a family of surfaces in $\mathbb{R}^n$
that moves, in the direction of the normal to $\Sigma_t$, at
unit speed.

The examples mentioned above illustrate two
general phenomena about the behavior of solutions
to [1]. The first is finite propagation speed. Its
general formulation is that, given a closed set
K ⊂ M,

$$ \operatorname{supp} f, g \subset K \implies \operatorname{supp} u(t,\cdot) \subset \{x \in M : \operatorname{dist}(x,K) \le |t|\} $$ [10]

In fact, given that [8]-[9] are fundamental solutions,
[10] is a consequence of these formulas when
M = $\mathbb{R}^2$ or $\mathbb{R}^3$. The result [10] is true in great
generality, with well-known demonstrations involving
energy estimates.
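
Finite propagation speed is easiest to see in one space dimension, a case not written out above, where d'Alembert's formula gives the solution explicitly. The sketch assumes zero initial velocity and an initial displacement supported in K = [−1, 1]; the names are illustrative.

```python
def dalembert(f, t, x):
    """Solution of u_tt = u_xx on the line with u(0, .) = f and u_t(0, .) = 0."""
    return 0.5 * (f(x + t) + f(x - t))


def bump(x):
    """Initial displacement supported in K = [-1, 1]."""
    return (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0
```

At time t the solution vanishes at every point whose distance to K exceeds |t|, while points within that distance may already be disturbed.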

The second phenomenon involves propagation of
singularities. Typically, if the Cauchy data f and g in
[3] are smooth on the complement of an (n − 1)-dimensional
surface $\Sigma_0$, perhaps with a jump across
$\Sigma_0$, or such a singularity as in [8] or [9], the solution
u(t,x) will be a sum of two terms, with singularities
of a similar nature on the surfaces $\Sigma_t^\pm$, moving at
unit speed in the direction of their normals, $\Sigma_t^+$
flowing from $\Sigma_0$ in one direction and $\Sigma_t^-$ in the
other. This also holds for the manifold case [2]. That
happens at least until such surfaces develop singularities,
when matters become more elaborate.

An alternative way to describe how the set of
singularities evolves is the following. Let $S_1M$ denote
the space of unit vectors tangent to M; this is
a submanifold of the tangent bundle of M, TM.
There is a natural projection $\pi : S_1M \to M$. Associated
to a smooth surface $\Sigma_0$ of dimension n − 1 in
M (of dimension n) are two preimages $\Lambda_0^+$ and $\Lambda_0^-$ in
$S_1M$, consisting of unit vectors lying over points of
$\Sigma_0$ and normal to $\Sigma_0$. The geodesic flow is a flow on
$S_1M$, and it takes $\Lambda_0^\pm$ to smooth (n − 1)-dimensional
surfaces $\Lambda_t^\pm$ in $S_1M$. The sets $\Sigma_t^\pm$ are the images of
$\Lambda_t^\pm$ under π. The geodesics starting out at points in
$\Lambda_0^\pm$ and sweeping out $\Lambda_t^\pm$ are the rays along which
the singularities of the solution u propagate.

This latter description works for all t if M has no
boundary and is complete, that is, all geodesics are
defined for all t, although singularities develop in
the images $\pi(\Lambda_t^\pm) = \Sigma_t^\pm$, at points $p \in \Sigma_t^\pm$ where $\Lambda_t^\pm$
meets $T_pM$ nontransversally. The behavior of u near
such singular points of $\Sigma_t^\pm$, known as caustics, is
more complicated than that near regular points, but
it can be captured in terms of integrals. Methods of
establishing this propagation of singularities are
discussed in the section on geometrical optics.

Such a description needs further elaboration if M 
has a boundary. One of the principal problems of 
diffraction theory is to explain how singularities of 
solutions to [1], with a boundary condition such as 
[4] or [5], propagate and reflect off the boundary. 

Considering the case where M is a half-space
in $\mathbb{R}^n$,

$$ M = \mathbb{R}^n_+ = \{x \in \mathbb{R}^n : x_n \ge 0\} $$ [11]

provides a guide to the simplest reflection phenomena.
In such a case, one can solve the Dirichlet or
Neumann boundary problem for the wave equation
[1] by the method of images. One extends f and g
from $\mathbb{R}^n_+$ to $\mathbb{R}^n$. For the Dirichlet problem [4], one
takes odd extensions, $f(x', -x_n) = -f(x', x_n)$, and
similarly for g. For the Neumann problem [5], one
takes even extensions, $f(x', -x_n) = f(x', x_n)$, etc.
One then solves the wave equation [1] on R × $\mathbb{R}^n$
with the extended Cauchy data, and the restriction
to R × $\mathbb{R}^n_+$ solves the respective boundary problem.
Suppose $\Sigma_0$ is a smooth (n − 1)-dimensional surface
that does not meet $\partial\mathbb{R}^n_+$, and that f and g have
singularities on $\Sigma_0$ as above. (Suppose for simplicity
that f and g vanish near $\partial\mathbb{R}^n_+$.) Those rays issuing
from normals to $\Sigma_0$ have mirror images, which are
rays in $\mathbb{R}^n$. If such a ray hits $\partial\mathbb{R}^n_+$, its mirror image
does so also, and continues into $\mathbb{R}^n_+$, as the reflected
ray. The singularities of u propagate along such
reflected rays.
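
The method of images can be tried out on the half-line x ≥ 0, the one-dimensional analogue of the half-space. The sketch below assumes zero initial velocity, uses d'Alembert's formula on the whole line for the extended data, and takes a pulse centered at x = 2 as an example; all names are illustrative.

```python
def odd_extension(f):
    return lambda x: f(x) if x >= 0 else -f(-x)


def even_extension(f):
    return lambda x: f(x) if x >= 0 else f(-x)


def solve_half_line(f, t, x, condition):
    """u(t, x) for u_tt = u_xx on x >= 0 with u(0, .) = f and u_t(0, .) = 0.

    condition is "dirichlet" (odd extension) or "neumann" (even extension).
    """
    if condition == "dirichlet":
        ext = odd_extension(f)
    elif condition == "neumann":
        ext = even_extension(f)
    else:
        raise ValueError("condition must be 'dirichlet' or 'neumann'")
    return 0.5 * (ext(x + t) + ext(x - t))


def pulse(x):
    """Smooth pulse of height 1 supported in [1, 3]; it vanishes near x = 0."""
    return (1.0 - (x - 2.0) ** 2) ** 2 if abs(x - 2.0) < 1.0 else 0.0
```

With the Dirichlet condition the left-moving half of the pulse returns inverted after hitting the wall, whereas with the Neumann condition it returns upright.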

Such a description extends to a general complete 
Riemannian manifold with boundary M, in the case 
of rays that hit the boundary transversally. Such a 
ray is reflected by retaining the tangential component
of its velocity vector at the point of intersection
with ∂M and reversing the sign of the normal component.
One says that the ray is reflected according to the
laws of geometrical optics. Singularities of u carried
by such rays that hit ∂M are correspondingly
reflected. Methods to establish such transversal
reflection of singularities are natural extensions of
those developed to treat the propagation away from
∂M, mentioned above.
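
In Euclidean space the reflection rule amounts to v ↦ v − 2(v·n)n/|n|², where n is normal to the boundary at the point of impact. A minimal sketch, with an illustrative function name:

```python
def reflect(velocity, normal):
    """Keep the tangential part of velocity and reverse its normal part."""
    n2 = sum(c * c for c in normal)
    vn = sum(v * c for v, c in zip(velocity, normal)) / n2
    return tuple(v - 2.0 * vn * c for v, c in zip(velocity, normal))
```

The map preserves the length of the velocity vector, so reflected rays keep unit speed, and applying it twice returns the original vector.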

Matters become more delicate when there are rays
that are tangent to ∂M. A model example is given by

$$ M = \mathbb{R}^n \setminus B, \qquad B = \{x \in \mathbb{R}^n : |x| < 1\} $$ [12]

which one takes when studying the scattering of
waves in $\mathbb{R}^n$ by the obstacle B. Consider a solution
to [1] with boundary condition given by [4] or [5]
that has a simple singularity on $\Sigma_t = \{x \in \mathbb{R}^n : x_n = t\}$
for t < −1. The associated rays are of the form
γ(t) = (x', t), for t < −1, with $x' \in \mathbb{R}^{n-1}$. If |x'| > 1,
these rays continue on in $\mathbb{R}^n \setminus B$, for all t > −1. If
|x'| < 1, these rays hit ∂M = ∂B transversally, and
their reflection is as described above. If |x'| = 1,
these rays hit ∂B tangentially, at t = 0; they are
sometimes called grazing rays. One also continues
them past t = 0. One defines in this fashion $\Sigma_t$ for
t > −1. The region

$$ S = \{x = (x', x_n) \in \mathbb{R}^n \setminus B : |x'| < 1,\ x_n > 0\} $$ [13]

is called the “shadow region." It is disjoint from $\Sigma_t$
for all t. The solution u is smooth in S for all t,
although it is not identically zero. The set


$$ \tilde{S} = \{x = (x', x_n) \in \mathbb{R}^n \setminus B : |x'| = 1,\ x_n > 0\} $$ [14]


is the “shadow boundary." 
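
The three kinds of incoming rays γ(t) = (x', t) for the unit ball can be distinguished by |x'| alone, and for a transversal ray the first hitting time follows from |x'|² + t² = 1. The sketch below uses illustrative names and a small tolerance to decide when a ray counts as grazing.

```python
import math


def classify_ray(x_prime, tol=1e-12):
    """Classify the ray t -> (x', t) relative to the unit ball.

    Returns (kind, hit_time), where hit_time is None if the ray misses the ball.
    """
    rho2 = sum(c * c for c in x_prime)
    if rho2 > 1.0 + tol:
        return ("misses", None)
    if rho2 < 1.0 - tol:
        return ("transversal", -math.sqrt(1.0 - rho2))
    return ("grazing", 0.0)
```

Transversal rays hit the sphere at a negative time, consistent with the grazing rays touching it at t = 0.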

One can replace B in [12] by a more general 
smooth, convex obstacle K, with positive Gauss 
curvature everywhere, and the same considerations 
of transversal and grazing rays and shadow regions 
apply. These notions also extend to a more general 
class of Riemannian manifolds with boundary, 
called manifolds with diffractive boundary. In the 
case K = B, one can use separation of variables to
reduce the problem of analyzing solutions to [1] and
showing that singularities propagate along such rays
to a problem in harmonic analysis on the sphere
$S^{n-1}$. For more general convex obstacles K or
manifolds with diffractive boundary, other techniques
are required, to show that waves reflect off the
boundary in a fashion similar to the case [12].

Another situation arises if instead of [12] one
takes M = B, or more generally M = K, a convex
region as described above. A ray starting off from
a point in ∂M, almost tangent to ∂M but with a
small component in the direction of the normal
pointing into M, will undergo many reflections in
a short time. Upon shrinking the normal component
of the initial velocity to zero, one obtains in
the limit a geodesic in ∂M, known as a gliding ray.
In such a case, singularities of solutions to [1], 
with such a boundary condition as [4] or [5], 
propagate along both transversally reflected and 
gliding rays. 

For the generic smooth obstacle K in $\mathbb{R}^n$, the
second fundamental form can have a variety of
signatures at various boundary points. Various types
of "generalized rays" occur, generally speaking
limits of sequences of transversally reflected rays.
This situation also holds for general complete
Riemannian manifolds with smooth boundary. The
main result about propagation of singularities in
such a case is that it is always along such generalized
rays. This was established by Melrose and Sjöstrand
(1978).

Further diffraction effects arise when ∂M has
singularities, such as edges and corners. The simplest
example is

$$ M = \{x \in \mathbb{R}^2 : a < \theta < b,\ r > 0\} $$ [15]

where (r, θ) are the polar coordinates of x ∈ $\mathbb{R}^2$, and
we assume 0 ≤ a < b ≤ 2π. Here one is studying the
diffraction of waves by a wedge. In the limiting case
a = 0, b = 2π, the wedge becomes a half-line, that is,

$$ M = \mathbb{R}^2 \setminus \{(x_1, 0) : x_1 \ge 0\} $$ [16]

Singularities of solutions to [1] on R × M with
such a boundary condition as [4] or [5] propagate in
the interior of M and reflect off the regular points of
∂M as before. If a family of continuous, piecewise
smooth curves $\Sigma_t$ carrying the singularity of u hit the
corner x = 0 at t = a, this reflection creates a tear in
$\Sigma_t$ for t > a. In addition, a diffracted wave spreads
out from the corner at unit speed. This diffracted
wave carries a singularity that is weaker than that of
the incident wave. For example, if one has a solution
like [8], but shifted to have support in a disk of radius
|t| about a point p ≠ 0 in $\mathbb{R}^2$, for small |t|, then the
diffracted wave will have a jump discontinuity.

The space M in [15] is a special case of a cone.
More generally, if N is a complete Riemannian
manifold (possibly with boundary), then the cone
C(N) with base N is the set

$$ C(N) = [0,\infty) \times N $$ [17]

with all points (0,x), x ∈ N, identified, with the
metric tensor

$$ ds^2 = dr^2 + r^2 g $$ [18]

where g is the metric tensor on N, and points on
C(N) are denoted (r,x), r ∈ [0,∞), x ∈ N. The space
in [15] has the form M = C(N) with N = [a, b], an
interval. A cone in Euclidean space $\mathbb{R}^n$ is of the form
C(N) with N a domain in the unit sphere $S^{n-1}$.

The propagation of singularities for solutions to [1] 
on C(N), when N has smooth boundary, has a 
description similar to that above for the case [15]. 
Again, there is a diffracted wave set off from the conic 
point {r = 0} when a singularity of a wave hits it. The
diffracted wave is typically (n — 1)/2 units smoother 
than the singular wave producing it, where 
n= dim C(N). For example, the fundamental solution 
to the wave equation on C(N) produces a diffracted 
wave which is the sum of a jump discontinuity and (in 
general) a logarithmic singularity. 

In fact, precise understanding of the behavior of 
the fundamental solution to the wave equation on 
C(N) is encoded in terms of the behavior of the 
solution operator to the wave equation on the base 
N. This is discussed in further detail in the section 
on harmonic analysis. In the case where C(N) is 
given by [15], we are dealing with the wave 
equation on an interval [a,b], whose behavior is 
elementary. 

One can use analysis of [15] together with finite
propagation speed to get a good qualitative picture of
diffraction of waves in $\mathbb{R}^2$ by a polygonal obstacle. A
variation of this argument allows one to understand
the behavior of the wave equation on a "polygonal"
domain N in $S^2$, that is, one whose boundary consists
of a finite number of geodesic segments in $S^2$. Going
from there to C(N), one can then analyze diffraction
of waves in $\mathbb{R}^3$ by a polyhedron.

It is worth remarking how the "shadow region"
for such an obstacle as a wedge in $\mathbb{R}^2$ differs from
that in [12]-[14]. For example, if one considers M
given by [16] and $u(t,x) = \delta(x_2 - t)$, for t < 0, then
the region

$$ \mathcal{S} = \{x = (x_1, x_2) : x_1 > 0,\ x_2 > 0\} $$ [19]

is the "shadow region," in the sense that rays either
missing or reflecting off the obstacle $\{(x_1, 0) : x_1 \ge 0\}$
do not enter the region [19]. However, unlike the
case [13], the solution u(t,x) is not smooth in the
region [19] for t > 0. There is a singularity there,


although it is weaker than the singularity of the 
main wave. 

Taking Cartesian products of spaces of the form
[15] with $\mathbb{R}^k$ yields spaces with k-dimensional
edges. There are also spaces with curvy edges.
Rather than continuing with further general
description, one more particular case is discussed
next, which has had a historical significance.
Namely, we consider the reflection of waves in $\mathbb{R}^3$
off a disk, that is, take

$$ M = \mathbb{R}^3 \setminus D, \qquad D = \{(x_1, x_2, 0) : x_1^2 + x_2^2 \le 1\} $$ [20]

Consider a wave given for t < 0 by $u(t,x) = \delta(x_3 - t)$.
This wave hits D = ∂M at t = 0, giving off a diffracted
wave, traveling away from $\sigma = \{(x_1, x_2, 0) : x_1^2 + x_2^2 = 1\}$
at speed 1 for t > 0. This diffracted wave
carries a singularity that blows up like the −1/2 power
of the distance to the torus of points of distance t from
σ, for t ∈ (0,1). For t > 1, there is a focusing effect
along the $x_3$-axis, producing a stronger singularity for
u(t,x) there.

This sort of phenomenon was understood, at 
least from a heuristic point of view, in the 
nineteenth century, and it played a role in an 
important argument of Poisson. At the time, there 
was a debate about whether the propagation of 
light was a wave phenomenon. Poisson did not 
think it was, and he noted that if it were, the light 
waves propagated past such an obstacle should 
produce a bright spot along the axis normal to the 
disk and through its center. The experiment was 
performed and the bright spot was observed. 
This is now called the Poisson spot, and its 
occurrence convinced many physicists, including 
Poisson, that the propagation of light is a wave 
phenomenon. 


Harmonic Analysis and the Wave 
Equation 


The wave equation [1] with Cauchy data [3] can be
regarded as an operator differential equation, with
solution

$$ u(t,x) = \cos t\sqrt{-\Delta}\, f(x) + \frac{\sin t\sqrt{-\Delta}}{\sqrt{-\Delta}}\, g(x) $$ [21]

This brings one to investigate functions of the self-adjoint
operator Δ. If M = $\mathbb{R}^n$, one can do this using
the Fourier transform, which is given by

$$ \mathcal{F}f(\xi) = \hat{f}(\xi) = (2\pi)^{-n/2} \int f(x)\, e^{-ix\cdot\xi}\, dx $$ [22]

One defines $\mathcal{F}^*$ by changing $e^{-ix\cdot\xi}$ to $e^{ix\cdot\xi}$ in [22],
and the Fourier inversion formula says $\mathcal{F}$ and $\mathcal{F}^*$ are
inverses of each other on various function spaces,
including $L^2(\mathbb{R}^n)$. Then one has

$$ \varphi(\sqrt{-\Delta})\, f(x) = (2\pi)^{-n/2} \int \varphi(|\xi|)\, \hat{f}(\xi)\, e^{ix\cdot\xi}\, d\xi $$ [23]


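Note that [23] is equal to

$$ \int \tilde{\varphi}(x - y)\, f(y)\, dy = \tilde{\varphi} * f(x) $$ [24]

where

$$ \tilde{\varphi}(x) = (2\pi)^{-n} \int \varphi(|\xi|)\, e^{ix\cdot\xi}\, d\xi $$ [25]

In particular, [21] becomes

$$ u(t,x) = \frac{\partial}{\partial t} R_t * f(x) + R_t * g(x) $$ [26]

where

$$ R_t(x) = (2\pi)^{-n} \int \frac{\sin t|\xi|}{|\xi|}\, e^{ix\cdot\xi}\, d\xi $$ [27]

is the fundamental solution to the wave equation.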
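
The multiplier formula u = cos(t√−Δ) f + (sin(t√−Δ)/√−Δ) g can be tried out on the circle, where Fourier series replace the Fourier transform and √−Δ acts on e^{ikx} as multiplication by |k|. The periodic setting, the plain O(n²) discrete Fourier transform, and the function names are assumptions made to keep the sketch self-contained.

```python
import cmath
import math


def dft(values):
    """Discrete Fourier coefficients c_k = (1/n) * sum_m v_m exp(-2*pi*i*k*m/n)."""
    n = len(values)
    return [sum(values[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n)) / n
            for k in range(n)]


def wave_solution(f_vals, g_vals, t):
    """u(t, x_m) on the grid x_m = 2*pi*m/n of the circle via Fourier multipliers."""
    n = len(f_vals)
    f_hat, g_hat = dft(f_vals), dft(g_vals)
    out = []
    for m in range(n):
        total = 0.0
        for k in range(n):
            freq = k if k <= n // 2 else k - n   # signed integer frequency
            a = abs(freq)                        # symbol of sqrt(-Laplacian)
            sinc = t if a == 0 else math.sin(t * a) / a
            coeff = math.cos(t * a) * f_hat[k] + sinc * g_hat[k]
            total += (coeff * cmath.exp(2j * math.pi * k * m / n)).real
        out.append(total)
    return out
```

For f(x) = cos 3x and g(x) = sin 2x the exact solution is cos 3t cos 3x + (sin 2t / 2) sin 2x, which the discrete computation reproduces to rounding error because both modes are resolved on the grid.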
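The integral [27] is not an easy integral when
n > 1, but the answer can be derived by analytic
continuation from the Poisson kernel, that is,

$$ e^{-y\sqrt{-\Delta}} f(x) = P_y * f(x), \qquad P_y(x) = C_n\, y\, (|x|^2 + y^2)^{-(n+1)/2} $$ [28]

where $C_n = \pi^{-(n+1)/2}\, \Gamma((n+1)/2)$. One gets

$$ R_t(x) = \lim_{\varepsilon \searrow 0} \frac{C_n}{n-1}\, \operatorname{Im} \left( |x|^2 - (t + i\varepsilon)^2 \right)^{-(n-1)/2} $$ [29]
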


Taking this limit for n = 2, 3 yields the formulas
[8]-[9]. There are several ways to derive [28]. One,
which is flexible and useful for other situations,
derives it from the formula for the heat kernel,

$$ e^{t\Delta} f(x) = H_t * f(x), \qquad H_t(x) = (4\pi t)^{-n/2}\, e^{-|x|^2/4t} $$ [30]

via the subordination identity:

$$ e^{-yA} = \frac{y}{2\pi^{1/2}} \int_0^\infty e^{-y^2/4t}\, e^{-tA^2}\, t^{-3/2}\, dt, \qquad y > 0 $$ [31]

with $A = \sqrt{-\Delta}$. The heat kernel can be computed via
[23], which becomes a well-known Gaussian integral.
The identity [31] can be proved using the fact
that the Fourier integral formula for $P_y(x)$ is
elementary to compute when n = 1.
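
Both the subordination identity and the Poisson kernel constant can be checked numerically for scalars. The sketch below evaluates the right-hand side of e^{−yA} = (y/(2√π)) ∫₀^∞ e^{−y²/4t} e^{−tA²} t^{−3/2} dt for a positive number A, using the substitution t = e^s and the trapezoid rule, and computes C_n = π^{−(n+1)/2} Γ((n+1)/2). The quadrature parameters and function names are choices made here.

```python
import math


def subordination_rhs(y, a, n=4000, s_min=-40.0, s_max=40.0):
    """Trapezoid rule for the subordination integral after substituting t = exp(s)."""
    h = (s_max - s_min) / n
    total = 0.0
    for i in range(n + 1):
        s = s_min + i * h
        t = math.exp(s)
        # dt = t ds, hence t**(-3/2) dt = t**(-1/2) ds
        val = math.exp(-y * y / (4.0 * t) - t * a * a) * t ** -0.5
        total += val * (0.5 if i in (0, n) else 1.0)
    return y / (2.0 * math.sqrt(math.pi)) * total * h


def poisson_constant(n):
    """C_n = pi**(-(n+1)/2) * Gamma((n+1)/2), the Poisson kernel normalization."""
    return math.pi ** (-(n + 1) / 2.0) * math.gamma((n + 1) / 2.0)
```

In the variable s the integrand decays doubly exponentially in both directions, so the trapezoid rule converges very quickly. For n = 1 and n = 3 the constant reduces to 1/π and 1/π².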

To understand functions of the Laplace operator
on a cone C(N), one uses

$$ \Delta = \frac{\partial^2}{\partial r^2} + \frac{n-1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\Delta_N $$ [32]

where $\Delta_N$ is the Laplace operator on N, which
follows from [2] and [18]. Here n = dim C(N). This is
a modified Bessel operator. We define the operator

$$ \nu = (-\Delta_N + \alpha^2)^{1/2}, \qquad \alpha = -\frac{n-2}{2} $$ [33]


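For each $\nu_j$ in the spectrum of ν, we consider the
Hankel transform

$$ H_{\nu_j} g(\lambda) = \int_0^\infty g(r)\, J_{\nu_j}(\lambda r)\, r\, dr $$ [34]

where $J_{\nu_j}$ is the Bessel function of order $\nu_j$. The
Hankel inversion formula says $H_{\nu_j}$ is unitary
on $L^2(\mathbb{R}^+, r\,dr)$ and is its own inverse. Consequently,
we can write the action of $\varphi(\sqrt{-\Delta})$ on
$L^2(C(N))$ as

$$ \varphi(\sqrt{-\Delta})\, g(r,x) = \int_0^\infty K_\varphi(r,s,\nu)\, g(s,x)\, s^{n-1}\, ds $$ [35]

where $K_\varphi(r,s,\nu)$ is a family of operators on $L^2(N)$,
given by

$$ K_\varphi(r,s,\nu) = (rs)^{\alpha} \int_0^\infty \varphi(\lambda)\, J_\nu(\lambda r)\, J_\nu(\lambda s)\, \lambda\, d\lambda $$ [36]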


To obtain the wave kernel on C(N), one can
analytically continue formulas for the Poisson
kernel, for $e^{-y\sqrt{-\Delta}}$. Such formulas arise from the
Lipschitz-Hankel identity:

$$ \int_0^\infty e^{-y\lambda}\, J_\nu(\lambda r)\, J_\nu(\lambda s)\, d\lambda = \frac{1}{\pi}\, (rs)^{-1/2}\, Q_{\nu-1/2}\!\left( \frac{r^2 + s^2 + y^2}{2rs} \right) $$ [37]

Here $Q_{\nu-1/2}(\tau)$ is a Legendre function. The identity
[37] is one of the more difficult identities in the
theory of Bessel functions. It is useful to know that it
can be derived by applying a slight variant of the
subordination identity [31] to the more elementary
identity

$$ \int_0^\infty e^{-t\lambda^2}\, J_\nu(\lambda r)\, J_\nu(\lambda s)\, \lambda\, d\lambda = \frac{1}{2t}\, e^{-(r^2+s^2)/4t}\, I_\nu\!\left( \frac{rs}{2t} \right) $$ [38]

(where $I_\nu(y) = e^{-\pi i\nu/2} J_\nu(iy)$ for y > 0), which describes
the behavior of the heat kernel on C(N).

Carrying out the analytic continuation of [37]
to imaginary y yields results stated in the section
on basic propagation phenomena, once one
understands the behavior of families of functions of
the operator ν so produced. An approach taken by
Cheeger and Taylor (1982) to this was to synthesize
these operators from $e^{is\nu}$, s ∈ R, and deduce their
behavior from the behavior of the solution operator
to the wave equation on the base N.

One can apply similar considerations to M = $\mathbb{R}^n \setminus B$,
which is the truncated cone [1,∞) × $S^{n-1}$, with
metric tensor [18], where g is the metric tensor on
$S^{n-1}$, and Laplace operator given by [32], with $\Delta_N$
the Laplace operator on $S^{n-1}$. The problem of
diffraction of waves by the ball B can be recast as
solving

$$ \frac{\partial^2 u}{\partial t^2} - \Delta u = 0 \ \text{ on } \mathbb{R} \times M, \qquad u|_{\mathbb{R} \times \partial M} = f, \qquad u(t,x) = 0 \ \text{ for } t \ll 0 $$ [39]

with f compactly supported on R × ∂M. Taking the
partial Fourier transform with respect to t yields the
reduced wave equation

$$ (\Delta + \lambda^2)\, v = 0 \ \text{ for } |x| > 1, \qquad v|_{S^{n-1}} = g(x,\lambda) $$ [40]

and the condition u(t,x) = 0 for t ≪ 0 yields for v
the outgoing radiation condition

$$ r^{(n-1)/2} \left( \frac{\partial v}{\partial r} - i\lambda v \right) \to 0 \quad \text{as } r \to \infty $$ [41]


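The solution is

$$ v(x,\lambda) = |x|^{-(n-2)/2}\, \frac{H^{(1)}_\nu(\lambda |x|)}{H^{(1)}_\nu(\lambda)}\, g\!\left( \frac{x}{|x|}, \lambda \right) $$ [42]

with ν as in [33] and $H^{(1)}_\nu$ the Hankel function.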
The behavior of Hj ()/Hj (A) as v, A— oo 
with ratio in a small Meetic dan of 1 can be 
shown to control the behavior of the solution u to 
[39] near grazing rays. There is an asymptotic 
formula for this, which is one of the most delicate 
analytical results in the theory of Bessel functions. 
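The result is that, uniformly for z near 1, as $\nu \to \infty$,

$$H^{(1)}_\nu(\nu z) \sim 2 e^{-\pi i/3} \left( \frac{4\zeta}{1 - z^2} \right)^{1/4} \left\{ \frac{A_+(\nu^{2/3}\zeta)}{\nu^{1/3}} \sum_{k \ge 0} a_k(\zeta)\, \nu^{-2k} + \frac{A_+'(\nu^{2/3}\zeta)}{\nu^{5/3}} \sum_{k \ge 0} b_k(\zeta)\, \nu^{-2k} \right\} \qquad [43]$$

Here

$$A_+(\zeta) = \mathrm{Ai}(e^{2\pi i/3} \zeta) \qquad [44]$$

where Ai is the Airy function. The coefficients $a_k(\zeta)$
and $b_k(\zeta)$ are smooth functions of their argument,
$\zeta = \zeta(z)$, which is defined by

$$\frac{2}{3}\, \zeta^{3/2} = \int_z^1 \frac{\sqrt{1 - t^2}}{t}\, dt \qquad [45]$$

Making use of [43] in [42], one can obtain a
parametrix for u (i.e., a solution modulo a $C^\infty$ error)
whose form is a special case of the formula [50],
which we will present in the next section.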


Geometrical Optics and Extensions 


By results of the last section, the solution to [1]
when $M = \mathbb{R}^n$ has the form

$$u(t,x) = \sum_{\pm} \int e^{i x \cdot \xi \pm i t |\xi|}\, \hat f_\pm(\xi)\, d\xi \qquad [46]$$

where the functions $\hat f_\pm$ are produced from the initial
data via simpler transformations. For a general
metric tensor, one can produce a parametrix (i.e.,
an approximation to u(t,x) with a $C^\infty$ error) in the
following form:

$$u(t,x) = \sum_{\pm} \int a^\pm(t,x,\xi)\, e^{i\varphi^\pm(t,x,\xi)}\, \hat f_\pm(\xi)\, d\xi \qquad [47]$$

Here the phase functions $\varphi^\pm(t,x,\xi)$ are smooth for
$\xi \ne 0$ and homogeneous of degree 1 in $\xi$. The
amplitudes $a^\pm(t,x,\xi)$ are smooth and have asymptotic
expansions as $|\xi| \to \infty$:

$$a^\pm(t,x,\xi) \sim \sum_{k \ge 0} a_k^\pm(t,x,\xi) \qquad [48]$$

with $a_k^\pm(t,x,\xi)$ homogeneous of degree $-k$ in $\xi$. One
applies $\partial_t^2 - \Delta$ to both sides of [47], and obtains an
operator of a similar form, with new amplitudes
$b^\pm(t,x,\xi) \sim \sum b_k^\pm(t,x,\xi)$. Setting the terms in this
asymptotic expansion equal to zero yields, first for
$\varphi^\pm(t,x,\xi)$, a partial differential equation known as
the eikonal equation:

$$\frac{\partial \varphi^\pm}{\partial t} = \pm |\nabla_x \varphi^\pm| \qquad [49]$$

where |v| is the norm of a vector $v \in T_xM$,
determined by the metric tensor. Setting $b_k^\pm(t,x,\xi) = 0$
for $k \ge 1$ yields linear differential equations for
the amplitude terms in [48], known as transport
equations.

Operators of the form [47] are special cases
of Fourier integral operators. Seminal works of
Keller (1953) and Lax (1957) gave an important
stimulus to work on these operators, and work of
Hörmander (1971) turned this into a systematic and
powerful theory. A particular advance regards
producing a parametrix valid for all t. Generally, one
can solve [49] and the associated transport equations
for t in some interval, past which the eikonal
equation might break down. Hörmander's theory
treats products of Fourier integral operators, yielding
global constructions. This facilitates the treatment of
caustics mentioned earlier. Stationary-phase methods
can be brought to bear to relate the singularities of
Th to those of h, when T is a Fourier integral
operator.
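
As a consistency check on the eikonal equation, consider the flat case $M = \mathbb{R}^n$ with the Euclidean metric. The plane-wave phases used below are an assumption made for illustration (they are the phases one expects from the constant-coefficient solution formula), and the sign convention $\partial_t \varphi^\pm = \pm|\nabla_x \varphi^\pm|$ is likewise assumed:

```latex
\varphi^{\pm}(t,x,\xi) = x\cdot\xi \pm t\,|\xi|
\quad\Longrightarrow\quad
\frac{\partial \varphi^{\pm}}{\partial t} = \pm|\xi|, \qquad
\nabla_x \varphi^{\pm} = \xi, \qquad
\frac{\partial \varphi^{\pm}}{\partial t} = \pm\,|\nabla_x \varphi^{\pm}| .
% With a constant amplitude the wave operator annihilates the oscillatory factor:
\left(\partial_t^2 - \Delta\right) e^{i\varphi^{\pm}}
 = \left(-|\xi|^2 + |\xi|^2\right) e^{i\varphi^{\pm}} = 0 .
```

These phases are smooth for $\xi \ne 0$ and homogeneous of degree 1 in $\xi$, and in this special case a constant amplitude already solves the transport equations, so no correction terms are needed.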

To construct parametrices for waves reflecting off
a boundary, one can again reduce the problem to
one of the form [39]. Waves that reflect transversally
are given by parametrices of the form [47],
although with the role of the variables changed, so
that t in [47]-[49] is replaced by a coordinate that
vanishes on $\mathbb{R} \times \partial M$.

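A parametrix that treats grazing rays can be written
in the form of a Fourier-Airy integral operator:

$$u(y) = \int \left[ a(y,\xi)\, A_+(\zeta(y,\xi)) + i|\xi|^{-1/3}\, b(y,\xi)\, A_+'(\zeta(y,\xi)) \right] A_+(\zeta_0)^{-1}\, e^{i\theta(y,\xi)}\, \hat F(\xi)\, d\xi \qquad [50]$$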


Here $y = (y_1, \ldots, y_{n+1})$ denotes a coordinate system
on a neighborhood of a boundary point of $\mathbb{R} \times M$,
with $y_{n+1} = 0$ on $\mathbb{R} \times \partial M$. We have a pair of phase
functions $\theta(y,\xi)$ and $\zeta(y,\xi)$, homogeneous in $\xi$ of
degree 1 and 2/3, respectively, and a pair of
amplitudes $a(y,\xi)$ and $b(y,\xi)$, each having asymptotic
expansions of the form [48]. The function $A_+$ is
the Airy function [44]. The phase functions satisfy a
coupled pair of eikonal equations:

$$\langle \nabla_y \theta, \nabla_y \theta \rangle + \zeta\, \langle \nabla_y \zeta, \nabla_y \zeta \rangle = 0, \qquad \langle \nabla_y \theta, \nabla_y \zeta \rangle = 0 \qquad [51]$$

where $\langle \cdot, \cdot \rangle$ denotes the Lorentz inner product on
$T_y(\mathbb{R} \times M)$ given by $dt^2 - g$. More precisely, [51] is
to hold in the region where $\zeta \le 0$, and also to
infinite order at $y_{n+1} = 0$, for $\zeta > 0$. One requires
$\partial\theta/\partial\xi_j$ to have linearly independent y-gradients, for
$j = 1, \ldots, n$, and

$$\zeta(y,\xi) = \zeta_0(\xi) = \xi_1^{-1/3}\, \xi_n \quad \text{for } y_{n+1} = 0 \qquad [52]$$

The terms in the asymptotic expansions of $a(y,\xi)$
and $b(y,\xi)$ satisfy coupled systems of transport
equations. One can arrange that $b(y,\xi) = 0$ for
$y_{n+1} = 0$. Then $u|_{\mathbb{R} \times \partial M} = TF$, where T is a Fourier
integral operator, which can be inverted, modulo a
smooth error, by Hörmander's theory, producing a
parametrix for [39].

The construction of solutions to [51] satisfying
[52] is due to Melrose. This followed earlier works
of Ludwig (1967), Melrose (1975), and Taylor
(1976), which produced solutions satisfying [52] to
infinite order at $\xi_n = 0$. This earlier construction is
adequate to produce a grazing ray parametrix, but
the sharper result [52] is extremely valuable for
constructing a gliding ray parametrix. This has the
form


$$u(y) = \int \left[ a\, \mathrm{Ai}(\zeta) + i|\xi|^{-1/3}\, b\, \mathrm{Ai}'(\zeta) \right] \mathrm{Ai}(\zeta_0)^{-1}\, e^{i\theta}\, \hat F(\xi)\, d\xi \qquad [53]$$

It differs from [50] in the use of Ai rather than $A_+$.
Since Ai has real zeros, it is also convenient to pick
$T > 0$ and evaluate $\theta$, $\zeta$, a, and b at $(\xi_1, \ldots, \xi_{n-1},
\xi_n + iT)$, and take $\zeta_0 = \xi_1^{-1/3}(\xi_n + iT)$. The treatment
of the eikonal and transport equations is as above,
though the Fourier-Airy integral operator [50] has a
different behavior from [53], reflecting the difference
between how singularities in solutions to the
wave equation are carried by grazing and by gliding
rays.
Introduction about Turbulence and Wavelets


What is Turbulence? 


Turbulence is a highly nonlinear regime encoun- 
tered in fluid flows. Such flows are described by 
continuous fields, for example, velocity or pressure, 
assuming that the characteristic scale of the fluid 
motions is much larger than the mean free path of 
the molecular motions. The prediction of the 
spacetime evolution of fluid flows from first 
principles is given by the solutions of the Navier- 
Stokes equations. The turbulent regime develops 
when the nonlinear term of Navier-Stokes equa- 
tions strongly dominates the linear term; the ratio 
of the norms of both terms is the Reynolds number 
Re, which characterizes the level of turbulence. In 
this regime, nonlinear instabilities dominate, which
makes the flow sensitive to initial conditions and
unpredictable.

The corresponding turbulent fields are highly 
fluctuating and their detailed motions cannot be 
predicted. However, if one assumes some statistical 
stability of the turbulence regime, averaged quan- 
tities, such as mean and variance, or other related 
quantities, for example, diffusion coefficients, lift or 
drag, may still be predicted. 

When turbulent flows are statistically stationary
(in time) or homogeneous (in space), as is
classically supposed, one studies their energy spectrum,
given by the modulus of the Fourier transform
of the velocity autocorrelation.

Unfortunately, since the Fourier representation 
spreads the information in physical space among the 
phases of all Fourier coefficients, the energy spec- 
trum loses all structural information in time or 
space. This is a major limitation of the classical way 
of analyzing turbulent flows. This is why we have 
proposed to use the wavelet representation instead 
and define new analysis tools that are able to 
preserve time and space locality. 

The same is true for computing turbulent flows.
Indeed, the Fourier representation is well suited to
study linear motions, for which the superposition
principle holds and whose generic behavior is either
to persist at a given scale or to spread to larger
ones. In contrast, the superposition principle does
not hold for nonlinear motions, their archetype
being the turbulent regime, which therefore cannot
be decomposed into a sum of independent motions
that can be separately studied. Generically, their
evolution involves a wide range of scales, exciting
smaller and smaller ones, even leading to finite-time
singularities, e.g., shocks. The “art” of predicting
the evolution of such nonlinear phenomena consists
of disentangling the active from the passive
elements: the former should be deterministically
computed, while the latter could either be discarded
or their effect statistically modeled. The wavelet
representation allows one to analyze the dynamics
in both space and scale, retaining only those degrees
of freedom which are essential to predict the
flow evolution. Our goal is to perform a kind
of “distillation” and retain only the elements
which are essential to compute the nonlinear
dynamics.


How Does One Study Turbulence?


When studying turbulence one is uneasy about the 
fact that there are two different descriptions, 
depending on which side of the Fourier transform 
one looks from. 


• On the one hand, looking from the Fourier space
representation, one has a theory which assumes
the existence of a nonlinear cascade in an
intermediate range of wavenumbers, called
the “inertial range,” where energy is conserved
and transferred towards high wavenumbers, but
only on average (i.e., considering either ensemble
or time or space averages). This implies that a
turbulent flow is excited at wavenumbers lower
than those of the inertial range and dissipated at
higher wavenumbers. Under these hypotheses, the
theory predicts that the energy spectrum in the
inertial range scales as $k^{-5/3}$ in
dimension 3 and as $k^{-3}$ in dimension 2, k being
the wavenumber, i.e., the modulus of the wave
vector.

• On the other hand, if one studies turbulence from
the physical space representation, there is not yet
any universal theory. One relies instead on
empirical observations, from both laboratory
and numerical experiments, which exhibit the
formation and persistence of coherent vortices,
even at very high Reynolds numbers. They
correspond to the condensation of the vorticity
field into some organized structures that contain
most of the energy ($L^2$-norm of velocity) and
enstrophy ($L^2$-norm of vorticity).


Moreover, the classical method for modeling turbulent
flows consists in neglecting high-wavenumber
motions and replacing them by their average, supposing
their dynamics to be either linear or slaved to the
low-wavenumber motions. Such a method would work
if there existed a clear separation between low and high
wavenumbers, that is, a spectral gap.

Actually, there is now strong evidence, from 
both laboratory and direct numerical simulation 
(DNS) experiments, that this is not the case. 
On the contrary, one observes that turbulent flows are
nonlinearly active all along the inertial range and that 
coherent vortices seem to play an essential dynamical 
role there, especially for transport and mixing. One 
may then ask the following questions: Are coherent 
vortices the elementary building blocks of turbulent 
flows? How can we extract them? Do their mutual 
interactions have a universal character? Can we 
compress turbulent flows and compute their evolu- 
tion with a reduced number of degrees of freedom 
corresponding to the coherent vortices? 

The DNS of turbulent flows, based on the integration
of the Navier-Stokes equations using either grid
points in physical space or Fourier modes in spectral
space, requires a number of degrees of freedom per
time step that varies as $Re^{9/4}$ in dimension 3 (and as
Re in dimension 2). Due to the inherent limitation of
computer performance, one can presently only perform
DNS of turbulent flows up to Reynolds numbers
$Re = 10^6$. To compute higher Reynolds number flows, one
should then design ad hoc turbulence models, whose
parameters are empirically adjusted to each type of
flow, in particular to their geometry and boundary
conditions, using data from either laboratory or
numerical experiments.
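
To get a feeling for how quickly the cost of DNS grows, the scalings quoted above can be evaluated directly. The sketch below is only an order-of-magnitude illustration: the function name is hypothetical and any prefactor in front of the power law is ignored.

```python
def dns_degrees_of_freedom(reynolds: float, dimension: int) -> float:
    """Order-of-magnitude number of degrees of freedom per time step for DNS.

    Uses the scalings quoted in the text: Re**(9/4) in dimension 3 and Re in
    dimension 2. Any prefactor is ignored (assumption of this sketch).
    """
    if reynolds <= 0:
        raise ValueError("Reynolds number must be positive")
    if dimension == 3:
        return reynolds ** (9.0 / 4.0)
    if dimension == 2:
        return float(reynolds)
    raise ValueError("only dimensions 2 and 3 are covered by the text")


for reynolds in (1e3, 1e4, 1e5):
    print(f"Re = {reynolds:.0e}: "
          f"3D ~ {dns_degrees_of_freedom(reynolds, 3):.1e}, "
          f"2D ~ {dns_degrees_of_freedom(reynolds, 2):.1e}")
```

Multiplying the Reynolds number by 10 multiplies the three-dimensional count by $10^{9/4} \approx 178$, which is why reduced descriptions become attractive at high Reynolds numbers.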


What are Wavelets? 


The wavelet transform unfolds signals (or fields)
into both time (or space) and scale, and possibly
directions in dimensions higher than 1. The starting
point is a function $\psi \in L^2(\mathbb{R})$, called the “mother
wavelet”, which is well localized in physical space
$x \in \mathbb{R}$, is oscillating ($\psi$ has at least a vanishing
integral, or better, its first m moments vanish), and
is smooth (its Fourier transform $\hat\psi(k)$ exhibits fast
decay for wavenumbers |k| tending to infinity). The
mother wavelet then generates a family of dilated
and translated wavelets

$$\psi_{a,b}(x) = a^{-1/2}\, \psi\!\left( \frac{x - b}{a} \right)$$

with $a \in \mathbb{R}^+$ the scale parameter and $b \in \mathbb{R}$ the
position parameter, all wavelets being normalized
in $L^2$-norm.
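
The role of the factor $a^{-1/2}$ can be checked numerically: dilating and translating the mother wavelet should leave its $L^2$-norm unchanged. The sketch below assumes, purely for illustration, the Marr ("Mexican hat") function as mother wavelet and a simple midpoint quadrature; the helper names are hypothetical.

```python
import math


def marr(x: float) -> float:
    """Marr ("Mexican hat") mother wavelet, used here only as an example."""
    return (1.0 - x * x) * math.exp(-0.5 * x * x)


def dilated_translated(mother, a: float, b: float):
    """Return psi_{a,b}(x) = a**(-1/2) * psi((x - b) / a) for a > 0."""
    if a <= 0:
        raise ValueError("scale a must be positive")
    return lambda x: mother((x - b) / a) / math.sqrt(a)


def l2_norm_squared(func, lo: float = -60.0, hi: float = 60.0,
                    n: int = 24000) -> float:
    """Midpoint-rule estimate of the integral of func(x)**2 over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) ** 2 for i in range(n))


norm_mother = l2_norm_squared(marr)
norm_child = l2_norm_squared(dilated_translated(marr, a=3.0, b=2.0))
print(norm_mother, norm_child)  # both close to 0.75 * sqrt(pi)
```

For this particular mother wavelet the squared norm is $\tfrac{3}{4}\sqrt{\pi}$, so dividing by its square root would give a family of unit norm.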
The wavelet transform of a function $f \in L^2(\mathbb{R})$ is
the inner product of f with the analyzing wavelets
$\psi_{a,b}$, which gives the wavelet coefficients: $\tilde f(a,b) =
\langle f, \psi_{a,b} \rangle = \int f(x)\, \psi^*_{a,b}(x)\, dx$. They measure the
fluctuations of f around the scale a and the position
b. f can then be reconstructed without any loss as
the inner product of its wavelet coefficients $\tilde f$ with
the analyzing wavelets $\psi_{a,b}$:

$$f(x) = C_\psi^{-1} \iint \tilde f(a,b)\, \psi_{a,b}(x)\, a^{-2}\, da\, db$$

$C_\psi = \int |\hat\psi(k)|^2\, |k|^{-1}\, dk$ being a constant which depends
on the wavelet $\psi$.
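
A direct way to see that the coefficients measure fluctuations around a given scale and position is to evaluate the inner product by quadrature. The following sketch assumes a real-valued Marr wavelet and a synthetic Gaussian bump as the signal; both choices, and the helper names, are illustrative assumptions rather than part of the method described here.

```python
import math


def marr(x: float) -> float:
    """Real-valued Marr ("Mexican hat") wavelet, chosen for illustration."""
    return (1.0 - x * x) * math.exp(-0.5 * x * x)


def wavelet_coefficient(f, a: float, b: float, lo: float = -40.0,
                        hi: float = 40.0, n: int = 8000) -> float:
    """Midpoint-rule estimate of <f, psi_{a,b}> for a real-valued wavelet.

    psi_{a,b}(x) = a**(-1/2) * psi((x - b) / a); no complex conjugate is
    needed because the Marr wavelet is real.
    """
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += f(x) * marr((x - b) / a) / math.sqrt(a)
    return total * h


def bump(x: float) -> float:
    """Test signal: a Gaussian bump centred at x = 2 (hypothetical data)."""
    return math.exp(-0.5 * (x - 2.0) ** 2)


positions = [0.5 * i for i in range(-8, 17)]  # b from -4 to 8
coeffs = [wavelet_coefficient(bump, a=1.0, b=b) for b in positions]
b_peak = positions[max(range(len(coeffs)), key=lambda i: coeffs[i])]
print(b_peak)  # prints 2.0, the location of the bump
```

At scale a = 1 the coefficient is largest where the wavelet sits on top of the bump and decays (changing sign) as b moves away, which is the localization in position that the Fourier coefficients lack.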

Like the Fourier transform, the wavelet transform
realizes a change of basis from physical space to
wavelet space which is an isometry. It thus conserves
the inner product (Plancherel theorem), and in
particular energy (Parseval's identity). Let us mention
that, due to the localization of wavelets in
physical space, the behavior of the signal at infinity
does not play any role. Therefore, the wavelet
analysis and synthesis can be performed locally, in
contrast to the Fourier transform, where the nonlocal
nature of the trigonometric functions does not allow
a local analysis.

Moreover, wavelets constitute building blocks of
various function spaces out of which some can be
used to construct orthogonal bases. The main
difference between the continuous and the orthogonal
wavelet transforms is that the latter is nonredundant,
but preserves the invariance by
translation and dilation only for a discrete subset of
wavelet space, which corresponds to the dyadic grid
$\lambda = (j,i)$, for which scale is sampled by octaves j and
space by positions $2^{-j} i$. The advantage is that all
orthogonal wavelet coefficients are decorrelated,
which is not the case for the continuous wavelet
transform, whose coefficients are redundant and
correlated in space and scale. Such a correlation
can be visualized by plotting the continuous wavelet
coefficients of a white noise: the patterns one
thus observes are due to the reproducing kernel of
the continuous wavelet transform, which corresponds
to the correlation between the analyzing
wavelets themselves.

In practice, to analyze turbulent signals or fields,
one should use the continuous wavelet transform
with complex-valued wavelets, since the modulus of
the wavelet coefficients allows one to read the evolution
of the energy density in both space (or time) and
scale. If one uses real-valued wavelets instead, the
modulus of the wavelet coefficients will present the
same oscillations as the analyzing wavelets and it
will then become difficult to sort out features
belonging to the signal or to the wavelets. In the case
of complex-valued wavelets, the quadrature between
the real and the imaginary parts of the wavelet
coefficients eliminates these spurious oscillations; this
is why we recommend using complex-valued wavelets,
such as the Morlet wavelet. To compress
turbulent flows, and a fortiori to compute their
evolution at a reduced cost compared to standard
methods (finite difference, finite volume, or spectral
methods), one should use orthogonal wavelets. This
avoids redundancy, since one has the same number of
grid points as wavelet coefficients. Moreover, there
exists a fast algorithm to compute the orthogonal
wavelet coefficients which is even faster than the fast
Fourier transform, having O(N) operations instead of
$O(N \log_2 N)$.
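
The linear cost of the fast wavelet algorithm is easiest to see on the Haar wavelet, the simplest orthogonal wavelet. The sketch below is an assumption-laden illustration (Haar is not necessarily the wavelet used in practice, and the function names are hypothetical): each pass halves the number of scaling coefficients, so the work is N/2 + N/4 + ... + 1 = N - 1 pairwise steps.

```python
import math


def haar_forward(signal):
    """Orthonormal Haar wavelet transform of a list of length 2**J.

    Returns (coarsest scaling coefficient, details, operations), where
    details[0] holds the finest-scale wavelet coefficients and operations
    counts the pairwise sum/difference steps performed.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(v) for v in signal]
    details = []
    operations = 0
    while len(approx) > 1:
        half = len(approx) // 2
        sums = [(approx[2 * i] + approx[2 * i + 1]) / math.sqrt(2.0)
                for i in range(half)]
        diffs = [(approx[2 * i] - approx[2 * i + 1]) / math.sqrt(2.0)
                 for i in range(half)]
        operations += half
        details.append(diffs)
        approx = sums
    return approx[0], details, operations


def haar_inverse(mean_coeff, details):
    """Invert haar_forward, rebuilding the signal from coarse to fine."""
    approx = [mean_coeff]
    for level in reversed(details):
        finer = []
        for s, d in zip(approx, level):
            finer.append((s + d) / math.sqrt(2.0))
            finer.append((s - d) / math.sqrt(2.0))
        approx = finer
    return approx


data = [4.0, 2.0, 5.0, 5.0, 1.0, 0.0, 3.0, 7.0]
mean_coeff, details, operations = haar_forward(data)
restored = haar_inverse(mean_coeff, details)
energy_in = sum(v * v for v in data)
energy_out = mean_coeff ** 2 + sum(c * c for level in details for c in level)
print(operations, energy_in, energy_out)  # 7 pairwise steps for N = 8
```

The transform returns exactly as many coefficients as there are samples, reconstructs the signal, and conserves energy, which are the properties invoked above for orthogonal wavelets. Longer filters change the constant in front of N but not the linear scaling.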

The first paper about the continuous wavelet
transform was published by Grossmann and
Morlet (1984). Then, discrete wavelets were
constructed, leading to frames (Daubechies et al.
1986) and orthogonal bases (Lemarié and Meyer,
1986). From there the formalism of multiresolution
analysis (MRA) was constructed, which led
to the fast wavelet algorithm (Mallat 1989). The
first application of wavelets to analyze turbulent
flows was published by Farge and Rabreau
(1988). Since then a long-term research program has
been developed for analyzing, computing and
modeling turbulent flows using either continuous
wavelets, orthogonal wavelets, or wavelet packets.


Wavelet Analysis 
Wavelet Spectra 


Wavelet space To study turbulent signals one uses
the continuous wavelet transform for analysis, and
the orthogonal wavelet transform for compression
and computation. To perform a continuous wavelet
transform, one can choose:

• either a real-valued wavelet, such as the Marr
wavelet, also called “Mexican hat,” which is the
second derivative of a Gaussian,

$$\psi(x) = (1 - x^2) \exp\!\left( -\frac{x^2}{2} \right) \qquad [1]$$

• or a complex-valued wavelet, such as the Morlet
wavelet,

$$\hat\psi(k) = \frac{1}{2\pi} \exp\!\left( -\frac{(k - k_\psi)^2}{2} \right) \ \text{for } k > 0, \qquad \hat\psi(k) = 0 \ \text{for } k \le 0 \qquad [2]$$

with the wavenumber $k_\psi$ denoting the barycenter of
the wavelet support in Fourier space computed as

$$k_\psi = \frac{\int_0^\infty k\, |\hat\psi(k)|\, dk}{\int_0^\infty |\hat\psi(k)|\, dk} \qquad [3]$$

For the orthogonal wavelet transform, there is 
a large collection of possible wavelets and the 
choice depends on which properties are preferred, 
for instance: compact support, symmetry, smooth- 
ness, number of cancelations, computational 
efficiency. 

From our own experience, we tend to prefer
the Coifman wavelet 12, which is compactly
supported, has four vanishing moments, is
quasi-symmetric, and is defined with a filter of length 12,
which leads to a computational cost for the fast
wavelet transform of 24N operations, since two
filters are used.

As stated above, we recommend the complex- 
valued continuous wavelet transform for analysis. In 
this case, one plots the modulus and the phase of the 
wavelet coefficients in wavelet space, with a linear 
horizontal axis for the position b, and a logarithmic 
vertical axis for the scale a, with the largest scale at 
the bottom and the smallest scale at the top. 

Figure 1 shows the wavelet analysis of
a turbulent signal (Figure 1a), corresponding to the time
evolution of the velocity fluctuations of two successive
vortex breakdowns, measured by hot-wire
anemometry at $N = 32768 = 2^{15}$ instants (Cuypers
et al. 2003). The modulus of the wavelet coefficients
(Figure 1b) shows that during the vortex break- 
down, which is due to strong nonlinear flow 
instability, energy is spread over a wide range of 
scales. The phase of the wavelet coefficients 
(Figure 1c) is plotted only where the modulus is 
non-negligible, otherwise the phase information 
would be meaningless. In Figure 1c, one observes 
that the lines of constant phase point towards the 
instants where the signal is less regular, that is, 
during vortex breakdowns. 


Local wavelet spectrum Since the wavelet transform
conserves energy and preserves locality in
physical space, one can extend the concept of energy
spectrum and define a local energy spectrum, such
that

$$\tilde E(k,x) = \frac{1}{2 C_\psi k_\psi} \left| \tilde f\!\left( \frac{k_\psi}{k}, x \right) \right|^2 \qquad [4]$$

where $k_\psi$ is the centroid wavenumber of the
analyzing wavelet $\psi$ and $C_\psi$ is defined in the
admissibility condition (respectively, eqns [10] and
[1] in the article Wavelets: Mathematical Theory).

Figure 1 Example of a one-dimensional continuous wavelet
analysis. (a) the signal to be analyzed, (b) the modulus of its
wavelet coefficients, (c) the phase of its wavelet coefficients.

By measuring $\tilde E(k,x)$ at different instants or
positions, one estimates which elements in the
signal contribute most to the global Fourier energy
spectrum, in order to suggest a way to decompose
the signal into different components. For example,
if one considers turbulent flows, one can compare 
the energy spectrum of the coherent structures 
(such as isolated vortices in incompressible flows 
or shocks in compressible flows) and the energy 
spectrum of the incoherent background flow, since 
both elements exhibit different correlations and 
therefore different spectral slopes. 


Global wavelet spectrum Although the wavelet
transform analyzes the flow using localized functions
rather than complex exponentials, one can
show that the global wavelet energy spectrum
converges towards the Fourier energy spectrum,
provided the analyzing wavelet has enough vanishing
moments. More precisely, the global wavelet
spectrum, defined by integrating [4] over all
positions,

$$\tilde E(k) = \int_{-\infty}^{\infty} \tilde E(k,x)\, dx \qquad [5]$$

gives the correct exponent for a power-law Fourier
energy spectrum $E(k) \propto k^{-\beta}$ if the analyzing wavelet
has at least $M > (\beta - 1)/2$ vanishing moments.
Thus, the steeper the energy spectrum one studies,
the more vanishing moments the analyzing wavelet
should have.
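
The condition on the number of vanishing moments is a simple piece of arithmetic, sketched below with hypothetical function names. The lower bound of one moment reflects the requirement, stated earlier, that a wavelet has at least a vanishing integral.

```python
import math


def min_vanishing_moments(beta: float) -> int:
    """Smallest integer M with M > (beta - 1) / 2 (strict inequality).

    At least one vanishing moment is always required, since a wavelet must
    have zero mean.
    """
    return max(math.floor((beta - 1.0) / 2.0) + 1, 1)


def max_detectable_exponent(moments: int) -> int:
    """Exponents beta are measured correctly only for beta < 2 * M + 1."""
    return 2 * moments + 1


for beta in (5.0 / 3.0, 3.0, 5.0):
    print(f"beta = {beta:.3f}: M >= {min_vanishing_moments(beta)}")
```

For the spectral slopes quoted earlier, one vanishing moment suffices for $k^{-5/3}$, whereas $k^{-3}$ requires two because the inequality is strict.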

The inertial range, which corresponds to the scales
at which turbulent flows are dominated by nonlinear
interactions, exhibits a power-law behavior as
predicted by the statistical theory of homogeneous
and isotropic turbulence.

The ability to correctly evaluate the slope of the 
energy spectrum is an important property of the 
wavelet transform which is related to its ability to 
detect and characterize singularities. We will not 
discuss here how wavelet coefficients could be used 
to study singularities and fractal measures, since it is 
presented in detail elsewhere (see Wavelets: 
Applications). 


Relation to Classical Analysis 


Relation to Fourier spectrum The global wavelet
energy spectrum $\tilde E(k)$ is actually a smoothed version
of the Fourier energy spectrum E(k). This can be
seen from the following relation between the two
spectra:

$$\tilde E(k) = \frac{1}{C_\psi k_\psi} \int_0^\infty E(k') \left| \hat\psi\!\left( \frac{k_\psi k'}{k} \right) \right|^2 dk' \qquad [6]$$

which shows that the global wavelet spectrum is an
average of the Fourier spectrum weighted by the
square of the Fourier transform of the analyzing
wavelet at wavenumber k. Note that the larger k,
the larger the averaging interval, because wavelets
are bandpass filters with $\Delta k/k$ constant. This
property of the global wavelet energy spectrum is
particularly useful to study turbulent flows. Indeed,
the Fourier energy spectrum of a single realization
of a turbulent flow oscillates too much for a slope to be
clearly detected, whereas this is no longer the case
for the global wavelet energy spectrum, which is a
better estimator of the spectral slope.

The real-valued Marr wavelet [1] has only two
vanishing moments and thus can correctly measure
the energy spectrum exponents up to $\beta < 5$. In the
case of the complex-valued Morlet wavelet [2], only
the zeroth-order moment is null, but the higher
mth-order moments are very small ($\propto k_\psi^m e^{-k_\psi^2/2}$)
provided that $k_\psi$ is larger than 5. For instance, the
Morlet wavelet transform with $k_\psi = 6$ gives accurate
estimates of the power-law exponent of the
energy spectrum up to $\beta < 7$.

There is also a family of wavelets with an infinite
number of cancelations

$$\hat\psi_n(k) = \alpha_n \exp\!\left( -\frac{1}{2} \left( k^2 + \frac{1}{k^{2n}} \right) \right), \quad n \ge 1 \qquad [7]$$

where $\alpha_n$ is chosen for normalization.

These wavelets can therefore correctly measure
any power-law energy spectrum, and thus detect the
difference between a power-law energy spectrum
and a Gaussian energy spectrum ($E(k) \propto e^{-(k/k_0)^2}$).
For instance, it is important in turbulence to
determine the wavenumber after which the
energy spectrum decays exponentially, since this
wavenumber defines the end of the inertial range,
dominated by nonlinear interactions, and the beginning
of the dissipative range, dominated by linear
dissipation.


Relation to structure functions In this subsection 
we will point out the limitations of classical 
measures of intermittency and present a set of 
wavelet-based alternatives. 


The classical measures based on structure func- 
tions can be thought of as a special case of wavelet 
filtering using a nonsmooth wavelet defined as the 
difference of two Diracs (DOD). It is this lack of 
regularity of the underlying wavelet that limits the 
adequacy of classical measures to analyze smooth 
signals. Wavelet-based diagnostics can overcome 
these limitations, and produce accurate results, 
whatever the signal to be analyzed. 

We will link the scale-dependent moments of the 
wavelet coefficients and the structure functions, 
which are classically used to study turbulence. In 
the case of second-order statistics, the global wavelet 
spectrum corresponds to the second-order structure 
function. Furthermore, a rigorous bound for the 
maximum exponent detected by the structure func- 
tions can be computed, but there is a way to 
overcome this limitation by using wavelets. 

The increments of a signal, also called the
modulus of continuity, can be seen as its wavelet
coefficients using the DOD wavelet

$$\psi^\delta(x) = \delta(x + 1) - \delta(x) \qquad [8]$$

We thus obtain

$$f(x + a) - f(x) = \tilde f_{x,a} = \langle f, \psi^\delta_{x,a} \rangle \qquad [9]$$

with $\psi^\delta_{x,a}(y) = \frac{1}{a} \left[ \delta\!\left( \frac{y - x}{a} + 1 \right) - \delta\!\left( \frac{y - x}{a} \right) \right]$.
Note that the wavelet is normalized with respect to
the $L^1$-norm. The pth-order structure function $S_p(a)$
therefore corresponds to the pth-order moment of
the wavelet coefficients at scale a

$$S_p(a) = \int (\tilde f_{x,a})^p\, dx \qquad [10]$$

As the DOD wavelet has only one vanishing
moment (its mean), the exponent of the pth-order
structure function in the case of a self-similar
behavior is limited by p, that is, if $S_p(a) \propto a^{\zeta(p)}$,
then $\zeta(p) < p$. To be able to detect larger exponents,
one has to use increments with a larger stencil, or
wavelets with more vanishing moments.
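
The saturation of the structure-function exponent can be observed on a smooth signal, for which the true regularity is much higher than what increments can detect. The sketch below is an illustration under stated assumptions: the signal is a sine wave on one period, the absolute value of the increment is used so that odd orders are also defined, and the exponent is estimated from two small increments. All names are hypothetical.

```python
import math


def structure_function(f, a: float, p: int, period: float = 2.0 * math.pi,
                       n: int = 4096) -> float:
    """Estimate S_p(a) as the integral of |f(x + a) - f(x)|**p over a period.

    The absolute value is an assumption of this sketch so that odd p are
    also well defined; for even p it changes nothing.
    """
    h = period / n
    return h * sum(abs(f(i * h + a) - f(i * h)) ** p for i in range(n))


def scaling_exponent(f, p: int, a_small: float = 0.01,
                     a_large: float = 0.02) -> float:
    """Slope of log S_p(a) against log a between two small increments."""
    s_small = structure_function(f, a_small, p)
    s_large = structure_function(f, a_large, p)
    return math.log(s_large / s_small) / math.log(a_large / a_small)


# A smooth periodic test signal (hypothetical data).
zeta_2 = scaling_exponent(math.sin, p=2)
zeta_4 = scaling_exponent(math.sin, p=4)
print(zeta_2, zeta_4)  # close to 2 and 4: the exponent saturates at p
```

Although the sine wave is infinitely differentiable, the measured exponents do not exceed p, which is the limitation of the one-vanishing-moment DOD wavelet described above.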

We now concentrate on the case p = 2, that is, the
energy norm. Equation [6] gives the relation
between the global wavelet spectrum $\tilde E(k)$ and the
Fourier spectrum E(k) for an arbitrary wavelet $\psi$.
For the DOD wavelet we find, since $\hat\psi^\delta(k) =
e^{ik} - 1 = e^{ik/2}(e^{ik/2} - e^{-ik/2})$ and hence $|\hat\psi^\delta(k)|^2 =
2(1 - \cos k)$, that

$$\tilde E(k) = \frac{1}{C_\psi k_\psi} \int_0^\infty E(k') \left( 2 - 2\cos\!\left( \frac{k_\psi k'}{k} \right) \right) dk' \qquad [11]$$

Setting $a = k_\psi/k$, we see that the wavelet spectrum
corresponds to the second-order structure function,
such that

$$\tilde E(k) = \frac{1}{C_\psi k_\psi}\, S_2(a) \qquad [12]$$

The above results show that, if the Fourier spectrum
behaves like $k^{-\alpha}$ for $k \to \infty$, then $\tilde E(k) \propto k^{-\alpha}$ if $\alpha < 2M + 1$,
where M denotes the number of vanishing
moments of the wavelet. Consequently, we find
for $S_2(a)$ that $S_2(a) \propto a^{\zeta(2)} = (k_\psi/k)^{\zeta(2)}$ for $a \to 0$ if
$\zeta(2) \le 2M$. For the DOD wavelet, we have M = 1;
therefore, the second-order structure function
can only detect slopes smaller than 2, corresponding
to an energy spectrum whose slope is shallower
than −3. Thus, the usual structure functions give
spurious results for sufficiently smooth signals. The
relation between structure functions and wavelet
coefficients can be generalized in the context of
Besov spaces, which are classically used for nonlinear
approximation theory (see Wavelets: Mathematical
Theory).


Intermittency Measures 


Intermittency is defined as localized bursts of high- 
frequency activity. This means that intermittent 
phenomena are localized in both physical and 
spectral spaces, and thus a suitable basis for 
representing intermittency should reflect this dual 
localization. The Fourier basis is well localized in 
spectral space, but delocalized in physical space. 
Therefore, when a turbulence signal is filtered using 
a high-pass Fourier transform and then recon- 
structed in physical space, for example, to calculate 
the flatness, some spatial information is lost. This 
leads to smoothing of strong gradients and spurious 
oscillations in the background, which come from the 
fact that the modulus and phase of the discarded 
high wavenumber Fourier modes have been lost. 
The spatial errors introduced by such a Fourier 
filtering lead to errors in estimating the flatness, and 
hence the signal's intermittency. 

When a quantity (e.g., velocity derivative) is 
intermittent, it contains rare but strong events (i.e., 
bursts of intense activity), which correspond to 
large deviations reflected in the “heavy tails" of the 
PDF. Second-order statistics (e.g., energy spectrum, 
second-order structure function) are relatively 
insensitive to such rare events whose time or 
space supports are very small and thus do not 
dominate the integral. However, these events 
become increasingly important for higher-order 
statistics, where they finally dominate. High-order 
statistics therefore characterize intermittency. Of
course, intermittency is not essential for all problems: 
second-order statistics are sufficient to measure 
dispersion (dominated by energy-containing scales), 
but not to calculate drag or mixing (dominated by 
vorticity production in thin boundary or shear 
layers). 

To measure intermittency, one uses the space-scale
information contained in the wavelet coefficients
to define scale-dependent moments and
moment ratios. Useful diagnostics to quantify the
intermittency of a field f are the moments of its
wavelet coefficients at different scales j

$$M_{p,j}(f) = 2^{-j} \sum_{i=0}^{2^j - 1} |\tilde f_{j,i}|^p \qquad [13]$$

Note that the distribution of energy scale by scale,
that is, the scalogram, can be computed from the
second-order moment of the orthogonal wavelet
coefficients: $E_j = 2^{j-1} M_{2,j}$. Due to orthogonality of
the decomposition, the total energy is just the sum:
$E = \sum_{j \ge 0} E_j$.

The sparsity of the wavelet coefficients at each
scale is a measure of intermittency, and it can be
quantified using ratios of moments at different
scales

$$Q_{p,q,j}(f) = \frac{M_{p,j}(f)}{(M_{q,j}(f))^{p/q}} \qquad [14]$$

which may be interpreted as quotient norms
computed in two different functional spaces,
$L^p$- and $L^q$-spaces. Classically, one chooses q = 2 to
define typical statistical quantities as a function of
scale. Recall that for p = 4 we obtain the
scale-dependent flatness $F_j = Q_{4,2,j}$. It is equal to 3 for a
Gaussian white noise at all scales j, which proves that
this signal is not intermittent. The scale-dependent
skewness, hyperskewness, and hyperflatness are
obtained for p = 3, 5, and 6, respectively. For
intermittent signals $Q_{p,q,j}$ increases with j, whatever p
and q.
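
The behavior of the scale-dependent flatness can be illustrated on two synthetic signals. The sketch below assumes Haar wavelets as the orthogonal basis (chosen only because they fit in a few lines), a seeded pseudo-random Gaussian white noise, and a single isolated spike as a caricature of an intermittent signal; the function names are hypothetical. Note that the coefficient lists are ordered from the finest scale to the coarsest, that is, by decreasing j.

```python
import math
import random


def haar_details(signal):
    """Orthonormal Haar wavelet coefficients, finest scale first.

    The signal length must be a power of two.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(v) for v in signal]
    details = []
    while len(approx) > 1:
        half = len(approx) // 2
        details.append([(approx[2 * i] - approx[2 * i + 1]) / math.sqrt(2.0)
                        for i in range(half)])
        approx = [(approx[2 * i] + approx[2 * i + 1]) / math.sqrt(2.0)
                  for i in range(half)]
    return details


def moment(coeffs, p: int) -> float:
    """Scale-dependent moment: mean of |coefficient|**p at one scale."""
    return sum(abs(c) ** p for c in coeffs) / len(coeffs)


def flatness(coeffs) -> float:
    """Scale-dependent flatness, M_4 / M_2**2 (moment ratio with p=4, q=2)."""
    return moment(coeffs, 4) / moment(coeffs, 2) ** 2


# Gaussian white noise (seeded for reproducibility): flatness near 3.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2 ** 14)]
noise_flatness = flatness(haar_details(noise)[0])

# A single spike: one nonzero coefficient per scale, so the flatness equals
# the number of coefficients at that scale and grows towards fine scales.
spike = [0.0] * 16
spike[5] = 1.0
spike_flatness = [flatness(level) for level in haar_details(spike)]
print(noise_flatness, spike_flatness)
```

For the spike, the flatness doubles with each finer scale, in line with the statement that the moment ratios of intermittent signals increase with j, whereas for the noise it stays near the Gaussian value.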


Wavelet Compression 
Principle 


To study turbulent signals, we now propose to 
separate the rare and extreme events from the dense 
events, and then calculate their statistics indepen- 
dently. A major difficulty in turbulence research is 
that there is no clear scale separation between these 
two kinds of events. This lack of “spectral gap” 
excludes Fourier filtering for disentangling these 
two behaviors. Since the rare events are well 
localized in physical space, one might try to use an
on-off filter defined in physical space to extract
them. However, this approach changes the spectral
properties by introducing spurious discontinuities,
adding an artificial scaling (e.g., $k^{-2}$ in one
dimension) to the energy spectrum. To avoid these
problems, we use the wavelet representation, which 
combines both physical and spectral space localiza- 
tions (bounded from below by Heisenberg’s uncer- 
tainty principle). In turbulence, the relevant rare 
events are the coherent vortices and the dense 
events correspond to the residual background flow. 
We have proposed a nonlinear wavelet filtering of 
the wavelet coefficients of vorticity to extract the 
coherent vortices out of turbulent flows. We now 
detail the different steps of this procedure. 


Extraction of Coherent Structures 


Principle We propose a new method to extract 
coherent structures from turbulent flows, as encoun- 
tered in fluids (e.g., vortices, shocklets) or plasmas 
(e.g., bursts), in order to study their role in transport 
and mixing. 

We first replace the Fourier representation by the 
wavelet representation, which keeps track of both 
time and scale, instead of frequency only. The 
second improvement consists in changing our view- 
point about coherent structures. Since there is not 
yet a universal definition of coherent structures, we 
prefer starting from a minimal but more consensual 
statement about them, that everyone hopefully could 
agree with: “coherent structures are not noise.” 
Using this apophatic method, we propose the 
following definition: “coherent structures are what 
remain after denoising.” 

For the noise we use the mathematical definition 
stating that a noise cannot be compressed in any 
functional basis. Another way to say this is to 
observe that the shortest description of a noise is the 
noise itself. Notice that often one calls “noise” what 
is actually “experimental noise,” but not noise in the 
mathematical sense. 

Considering our definition of coherent structures, 
turbulent signals can be split into two contribu- 
tions: coherent bursts, corresponding to that part of 
the signal which can be compressed in a wavelet 
basis, and incoherent noise, corresponding to that 
part of the signal which cannot be compressed, 
neither in wavelets nor in any other basis. We will 
then check a posteriori that the incoherent con- 
tribution is spread, and therefore does not com- 
press, in both Fourier and grid-point basis. Since we 
use the orthogonal wavelet representation, both
coherent and incoherent components are
orthogonal and therefore the $L^2$-norm, for example,
energy or enstrophy, is a superposition of coherent
and incoherent contributions (Mallat 1998).

Assuming that coherent structures are what 
remain after denoising, we need a model, not for 
the structures themselves, but for the noise. As a first 
guess, we choose the simplest model and suppose the 
noise to be additive, Gaussian and white, that is, 
uncorrelated. Having this model in mind, we use 
Donoho and Johnstone’s theorem to compute the 
value to threshold the wavelet coefficients. Since the 
threshold value depends on the variance of the noise, 
which in the case of turbulence is not a priori 
known, we propose a recursive method to estimate 
it from the variance of the weakest wavelet 
coefficients, that is, those whose modulus is below 
the threshold value. 
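The recursive estimate described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes zero-mean coefficients, the universal Donoho-Johnstone threshold $\varepsilon = (2\sigma^2 \ln n)^{1/2}$, and a stopping rule (iterate until the set of weak coefficients no longer changes) that the text does not specify. The function name and the synthetic data are hypothetical.

```python
import math
import random


def estimate_noise_threshold(coeffs, max_iter=50):
    """Iteratively estimate the noise variance and the threshold.

    Returns (eps, sigma2). Assumes zero-mean coefficients, n >= 2, and
    the universal threshold eps = sqrt(2 * sigma2 * ln n).
    """
    n = len(coeffs)
    if n < 2:
        raise ValueError("need at least two coefficients")
    # First guess: treat every coefficient as noise.
    sigma2 = sum(c * c for c in coeffs) / n
    eps = math.sqrt(2.0 * sigma2 * math.log(n))
    n_weak_prev = n
    for _ in range(max_iter):
        # Weakest coefficients: modulus below the current threshold.
        weak = [c for c in coeffs if abs(c) <= eps]
        sigma2 = sum(c * c for c in weak) / len(weak)
        if len(weak) == n_weak_prev:
            break  # the weak set no longer changes: converged
        n_weak_prev = len(weak)
        eps = math.sqrt(2.0 * sigma2 * math.log(n))
    return eps, sigma2


# Synthetic test case (an assumption for illustration only):
# unit-variance Gaussian noise plus a few strong "coherent" coefficients.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
bursts = [50.0] * 20
eps, sigma2 = estimate_noise_threshold(noise + bursts)
```

The first pass overestimates the variance because the strong coefficients are included; each later pass recomputes it from the coefficients below the current threshold only, so the estimate moves toward the variance of the noise alone.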


Wavelet decomposition We describe the wavelet
algorithm to extract coherent vortices out of
turbulent flows and apply it as an example to a 3D
turbulent flow. We consider the vorticity field
$\omega = \nabla \times v$, computed at resolution $N = 2^{3J}$, $N$ being
the number of grid points and $J$ the number of
octaves in each spatial direction. Each vorticity
component is developed into an orthogonal wavelet
series from the largest scale $l_{\max} = 2^{0}$ to the smallest
scale $l_{\min} = 2^{J-1}$ using a three-dimensional (3D) MRA:

\[ \omega(x) = \bar\omega_{0,0,0}\,\phi_{0,0,0}(x) + \sum_{j=0}^{J-1} \sum_{i_x,i_y,i_z=0}^{2^j-1} \sum_{d=1}^{7} \tilde\omega^{d}_{j,i_x,i_y,i_z}\,\psi^{d}_{j,i_x,i_y,i_z}(x) \]

The 3D functions are tensor products of $\phi_{j,i}$ and $\psi_{j,i}$,
where $\phi_{j,i}$ and $\psi_{j,i}$ are the one-dimensional
scaling function and the corresponding wavelet,
respectively. Due to orthogonality, the scaling coefficients
are given by $\bar\omega_{0,0,0} = \langle \omega, \phi_{0,0,0} \rangle$ and the wavelet
coefficients are given by $\tilde\omega^{d}_{j,i_x,i_y,i_z} = \langle \omega, \psi^{d}_{j,i_x,i_y,i_z} \rangle$, where
$\langle \cdot, \cdot \rangle$ denotes the $L^2$-inner product.


Nonlinear thresholding The vorticity field is then
split into $\omega_C$ and $\omega_I$ by applying a nonlinear thresholding
to the wavelet coefficients. The threshold is defined
as $\varepsilon = (\frac{4}{3} Z \ln N)^{1/2}$. It depends only on the total
enstrophy $Z = \frac{1}{2} \int |\omega|^2\,dx$ and on the number of grid
points $N$, without any adjustable parameter. The choice
of this threshold is based on theorems by Donoho
and Johnstone proving optimality of the wavelet
representation to denoise signals in the presence of
Gaussian white noise, since this wavelet-based
estimator minimizes the maximal $L^2$-error for functions
with inhomogeneous regularity (Mallat 1998).


Wavelet reconstruction The coherent vorticity field
$\omega_C$ is reconstructed from the wavelet coefficients
whose modulus is larger than $\varepsilon$ and the incoherent
vorticity field $\omega_I$ from the wavelet coefficients whose
modulus is smaller than or equal to $\varepsilon$. The two fields thus
obtained, $\omega_C$ and $\omega_I$, are orthogonal, which ensures
a separation of the total enstrophy into $Z = Z_C + Z_I$
because the interaction term $\langle \omega_C, \omega_I \rangle$ vanishes. We
then use Biot-Savart's relation $v = \nabla \times (\nabla^{-2} \omega)$ to
reconstruct the coherent velocity $v_C$ and the incoherent
velocity $v_I$ from the coherent and incoherent
vorticities, respectively.
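The three steps (decomposition, thresholding, reconstruction) and the resulting additivity of the squared norm can be illustrated on a 1D signal. The sketch below is a simplified analog, not the 3D algorithm of the text: it assumes an orthonormal Haar basis, a scalar signal whose length is a power of two, and it keeps the scaling coefficient in the coherent part. All function names are hypothetical, and the threshold used in the example is the scalar universal threshold rather than the enstrophy-based value quoted above.

```python
import math

_S = 1.0 / math.sqrt(2.0)


def haar_forward(signal):
    """Orthonormal Haar transform; len(signal) must be a power of two.

    Output layout: [scaling coeff, coarsest detail, ..., finest details].
    """
    w = list(signal)
    n = len(w)
    while n > 1:
        half = n // 2
        approx = [(w[2 * i] + w[2 * i + 1]) * _S for i in range(half)]
        detail = [(w[2 * i] - w[2 * i + 1]) * _S for i in range(half)]
        w[:n] = approx + detail
        n = half
    return w


def haar_inverse(coeffs):
    """Inverse of haar_forward."""
    x = list(coeffs)
    n = 1
    while n < len(x):
        merged = []
        for a, d in zip(x[:n], x[n:2 * n]):
            merged += [(a + d) * _S, (a - d) * _S]
        x[:2 * n] = merged
        n *= 2
    return x


def split_coherent_incoherent(signal, eps):
    """Split a signal into coherent (|coeff| > eps) and incoherent parts."""
    w = haar_forward(signal)
    # Assumption: the scaling coefficient (index 0) stays in the coherent part.
    w_coh = [c if (k == 0 or abs(c) > eps) else 0.0 for k, c in enumerate(w)]
    w_inc = [c - cc for c, cc in zip(w, w_coh)]
    return haar_inverse(w_coh), haar_inverse(w_inc)


def half_squared_norm(x):
    """Discrete analog of (1/2) * integral of |x|^2."""
    return 0.5 * sum(v * v for v in x)


# Example: a smooth wave with one localized burst (synthetic data).
n = 64
signal = [math.sin(2 * math.pi * i / n) + (3.0 if i == 20 else 0.0)
          for i in range(n)]
sigma2 = sum(v * v for v in signal) / n
eps = math.sqrt(2.0 * sigma2 * math.log(n))
coherent, incoherent = split_coherent_incoherent(signal, eps)
```

Because the two parts are built from disjoint sets of coefficients of an orthonormal basis, `half_squared_norm(signal)` equals the sum of the values for `coherent` and `incoherent` up to rounding, which is the discrete counterpart of $Z = Z_C + Z_I$.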


Application to 3D Turbulence 


We consider a 3D homogeneous isotropic turbulent
flow, computed by DNS at resolution $N = 256^3$,
which corresponds to a Reynolds number based
on the Taylor microscale $R_\lambda = 168$ (Farge et al.
2003). The computation uses a pseudospectral
code, with a Gaussian random vorticity field as initial
condition, and the flow evolution is integrated until a
statistically stationary state is reached. Figure 2 shows
the modulus of the vorticity fluctuations of the total
flow, zooming on a $64^3$ subcube to enhance structural
details. The flow exhibits elongated, distorted, and
folded vortex tubes, as observed in laboratory and
numerical experiments.

We apply to the total flow the wavelet compression
algorithm described above. We find that only
2.9% of the wavelet modes correspond to the coherent
flow, which retains 79% of the energy ($L^2$-norm of
velocity) and 75% of the enstrophy ($L^2$-norm of
vorticity), while the remaining 97.1% incoherent
modes contain only 1% of the energy and 21% of
the enstrophy. We display the modulus of the
coherent (Figure 3) and incoherent (Figure 4) vorticity
fluctuations resulting from the wavelet
decomposition.

Note that the values of the three isosurfaces chosen
for visualization ($|\omega| = 6Z^{1/2}$, $8Z^{1/2}$, and $10Z^{1/2}$, with
$Z$ the total enstrophy) are the same for the total and
coherent vorticities, but they have been reduced by a
factor 2 for the incoherent vorticity whose fluctuations
are much smaller. In the coherent vorticity (Figure 3)
we recognize the same vortex tubes as those present in
the total vorticity (Figure 2). In contrast, the remaining
vorticity (Figure 4) is much more homogeneous and
does not exhibit coherent structures. Hence, the
wavelet compression retains all the vortex tubes and
preserves their structure at all scales. Consequently, the
coherent flow is as intermittent as the total flow, while
the incoherent flow is structureless and nonintermittent.
Modeling the effect of the incoherent flow onto
the coherent flow should then be much simpler than
with methods based on Fourier filtering.

Figure 2 Isosurfaces of total vorticity field, for
$|\omega| = 3\sigma, 4\sigma, 5\sigma$ with opacity 1, 0.5, 0.1, respectively, and $\sigma^2$ the
total enstrophy. Simulation with resolution $N = 256^3$ for $R_\lambda = 168$.
Zoom on a subcube $64^3$. Reprinted with permission from Farge
et al. Coherent vortex extraction in three-dimensional homogeneous
turbulence: Comparison between CVS-wavelet and
POD-Fourier decompositions. Physics of Fluids 15(10): 2886-2896.
Copyright 2003, American Institute of Physics.

Figure 3 Isosurfaces of coherent vorticity field, for
$|\omega| = 3\sigma, 4\sigma, 5\sigma$ with opacity 1, 0.5, 0.1, respectively. Simulation
with resolution $N = 256^3$. Zoom on a subcube $64^3$. Reprinted with
permission from Farge et al. Coherent vortex extraction in three-dimensional
homogeneous turbulence: Comparison between CVS-wavelet
and POD-Fourier decompositions. Physics of Fluids
15(10): 2886-2896. Copyright 2003, American Institute of Physics.

Figure 4 Isosurfaces of incoherent vorticity field, for
$|\omega| = \frac{3}{2}\sigma, 2\sigma, \frac{5}{2}\sigma$ with opacity 1, 0.5, 0.1, respectively. Simulation
with resolution $N = 256^3$. Zoom on a subcube $64^3$. Reprinted
with permission from Farge et al. Coherent vortex extraction in
three-dimensional homogeneous turbulence: Comparison between
CVS-wavelet and POD-Fourier decompositions. Physics of Fluids
15(10): 2886-2896. Copyright 2003, American Institute of Physics.

Figure 5 shows the velocity PDF in semilogarithmic
coordinates. We observe that the coherent velocity has
the same Gaussian distribution as the total velocity,
while the incoherent velocity remains Gaussian, but its
variance is much smaller. The corresponding energy
spectra are plotted in Figure 6. We observe that the
spectrum of the coherent energy is identical to the
spectrum of the total energy all along the inertial
range. This implies that the vortex tubes are responsible
for the $k^{-5/3}$ energy scaling, which corresponds to
a long-range correlation, characteristic of 3D turbulence
as predicted by Kolmogorov's theory. In contrast,
the incoherent energy has a scaling close to $k^{2}$,
which corresponds to an energy equipartition between
all wave vectors $\mathbf{k}$, since the isotropic spectrum is
obtained by integrating energy in 3D $\mathbf{k}$-space over 2D
shells $k = |\mathbf{k}|$. The incoherent velocity field is therefore
spatially uncorrelated, which is consistent with the
observation that incoherent vorticity is structureless
and homogeneous.

Figure 5 Velocity PDF (curves: coherent, incoherent, Gaussian fit),
resolution $N = 256^3$ with a zoom at $64^3$. Reprinted with
permission from Farge et al. Coherent vortex
extraction in three-dimensional homogeneous turbulence:
Comparison between CVS-wavelet and POD-Fourier decompositions.
Physics of Fluids 15(10): 2886-2896. Copyright 2003,
American Institute of Physics.

Figure 6 Energy spectrum, resolution $N = 256^3$ with a
zoom at $64^3$. Reprinted with permission from Farge et al.
Coherent vortex extraction in three-dimensional homogeneous
turbulence: Comparison between CVS-wavelet and POD-Fourier
decompositions. Physics of Fluids 15(10): 2886-2896. Copyright
2003, American Institute of Physics.
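The link between equipartition and a $k^2$ spectrum can be made explicit with a one-line calculation. The sketch assumes that every wave vector carries the same energy density, written here as $e_0$ (a symbol introduced only for this illustration), so that the isotropic spectrum is simply $e_0$ times the area of the spherical shell of radius $k$:

```latex
\[
  E(k) \;=\; \int_{|\mathbf{k}'| = k} e_0 \, dS(\mathbf{k}')
       \;=\; 4\pi k^{2}\, e_0 \;\propto\; k^{2}.
\]
```

A spectrum growing as $k^2$ in three dimensions is therefore the signature of a field whose Fourier modes are all equally energetic, that is, of a spatially uncorrelated field.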

From these observations, we propose the following 
scenario to interpret the turbulent cascade: the 
coherent energy injected at large scales is transferred 
towards small scales by nonlinear interactions between 
vortex tubes. In the meantime, these nonlinear inter- 
actions also produce incoherent energy at all scales, 
which is dissipated at the smallest scales by molecular 
kinematic viscosity. Thus, the coherent flow causes 
direct transfer of the coherent energy into incoherent 
energy. Conversely, the incoherent flow does not 
trigger any energy transfer to the coherent flow, as it 
is structureless and uncorrelated. We conjecture that 
the coherent flow is dynamically active, while the 
incoherent flow is slaved to it, being only passively 
advected and mixed by the coherent vortex tubes. This 
is a different view from the classical interpretation 
since it does not suppose any scale separation. Both
coherent and incoherent flows are active all along the
inertial range, but they are characterized by different
probability distribution functions and correlations:
non-Gaussian and long-range correlated for the
former, while Gaussian and uncorrelated for the latter.


Wavelet Computation 
Principle 


The mathematical properties of wavelets (see Wavelets:
Mathematical Theory) motivate their use for
solving partial differential equations (PDEs).

The localization of wavelets, both in scale and 
space, leads to effective sparse representations of 
functions and pseudodifferential operators (and their 
inverse) by performing nonlinear thresholding of the 
wavelet coefficients of the function and of the matrices 
representing the operators. Wavelet coefficients allow one
to estimate the local regularity of solutions of PDEs
and thus can define autoadaptive discretizations with 
local mesh refinements. The characterization of func- 
tion spaces in terms of wavelet coefficients and the 
corresponding norm equivalences lead to diagonal 
preconditioning of operators in wavelet space. 

Moreover, the existence of the fast wavelet trans- 
form yields algorithms with optimal linear complex- 
ity. The currently existing algorithms can be 
classified in different ways. We can distinguish 
between Galerkin, collocation, and hybrid schemes. 
Hybrid schemes combine classical discretizations, 
for example, finite differences or finite volumes, and 
wavelets, which are only used to speed up the linear 
algebra and to define adaptive grids. On the other 
hand, Galerkin and collocation schemes employ 
wavelets directly for the discretization of the 
solution and the operators. Wavelet methods have
been developed to solve the Burgers, Stokes,
Kuramoto-Sivashinsky, nonlinear Schrödinger, Euler,
and Navier-Stokes equations. As an example, we
present an adaptive wavelet algorithm, of Galerkin 
type, to solve the 2D Navier-Stokes equations. 


Adaptive Wavelet Scheme 


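We consider the 2D Navier-Stokes equations written
in terms of vorticity $\omega$ and stream function $\Psi$,
which are both scalars in two dimensions,

\[ \partial_t \omega + v \cdot \nabla \omega - \nu \nabla^2 \omega = \nabla \times F \qquad [17] \]

\[ \nabla^2 \Psi = \omega \quad \text{and} \quad v = \nabla^{\perp} \Psi \qquad [18] \]

for $x \in [0,1]^2$, $t > 0$. The velocity is denoted by $v$, $F$
is an external force, $\nu > 0$ is the molecular kinematic
viscosity, and $\nabla^{\perp} = (-\partial_y, \partial_x)$.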


The above equations are completed with bound- 
ary conditions and a suitable initial condition. 


Time discretization Introducing a classical semi-implicit
time discretization with a time step $\Delta t$ and
setting $\omega^{n}(x) = \omega(x, n\Delta t)$, we obtain

\[ (1 - \nu \Delta t \nabla^2)\,\omega^{n+1} = \omega^{n} + \Delta t\,(\nabla \times F^{n} - v^{n} \cdot \nabla \omega^{n}) \qquad [19] \]

\[ \nabla^2 \Psi^{n+1} = \omega^{n+1} \quad \text{and} \quad v^{n+1} = \nabla^{\perp} \Psi^{n+1} \qquad [20] \]

Hence, in each time step two elliptic problems
have to be solved and a differential operator has to
be applied.

Formally the above equations can be written in
the abstract form $Lu = f$, where $L$ is an elliptic
operator with constant coefficients. This corresponds
to a Helmholtz type equation for $\omega$ with
$L = (1 - \nu \Delta t \nabla^2)$ and a Poisson equation for $\Psi$ with
$L = \nabla^2$.
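The form of the Helmholtz problem can be recovered in two steps. The sketch below assumes that "semi-implicit" means an implicit (backward Euler) treatment of the viscous term and an explicit (forward Euler) treatment of the forcing and advection terms, which is one common choice consistent with the operator $L = (1 - \nu \Delta t \nabla^2)$ quoted above; the source does not spell out the scheme.

```latex
\[
  \frac{\omega^{n+1} - \omega^{n}}{\Delta t} - \nu \nabla^{2} \omega^{n+1}
  \;=\; \nabla \times F^{n} - v^{n} \cdot \nabla \omega^{n}
\]
\[
  \Longrightarrow \quad
  (1 - \nu \Delta t \nabla^{2})\, \omega^{n+1}
  \;=\; \omega^{n} + \Delta t \left( \nabla \times F^{n} - v^{n} \cdot \nabla \omega^{n} \right).
\]
```

The second line follows by multiplying the first by $\Delta t$ and collecting the $\omega^{n+1}$ terms on the left. Only the linear operator acts on the unknown, so each step costs one Helmholtz solve, one Poisson solve for $\Psi^{n+1}$, and one evaluation of the nonlinear term.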


Spatial discretization For the spatial discretization,
we use the method of weighted residuals, that is, a
Petrov-Galerkin scheme. The trial functions
are orthogonal wavelets $\psi$ and the test functions
are operator adapted wavelets, called “vaguelettes,”
$\theta$. To solve the elliptic equation $Lu = f$ at time
step $t^{n+1}$, we develop $u^{n+1}$ into an orthogonal
wavelet series, that is, $u^{n+1} = \sum_{\lambda} \tilde u^{n+1}_{\lambda} \psi_{\lambda}$, where
$\lambda = (j, i_x, i_y, d)$ denotes the multi-index for scale $j$,
space $i$, and direction $d$. Requiring that the residual
vanishes with respect to all test functions $\theta_{\lambda'}$, we
obtain a linear system for the unknown wavelet
coefficients $\tilde u^{n+1}_{\lambda}$ of the solution $u$:

\[ \sum_{\lambda} \tilde u^{n+1}_{\lambda} \langle L \psi_{\lambda}, \theta_{\lambda'} \rangle = \langle f, \theta_{\lambda'} \rangle \qquad [21] \]

The test functions $\theta$ are defined such that the
stiffness matrix turns out to be the identity.
Therefore, the solution of $Lu = f$ reduces to a
change of basis, that is, $u^{n+1} = \sum_{\lambda} \langle f, \theta_{\lambda} \rangle \psi_{\lambda}$. The
right-hand side (RHS) $f$ can then be developed into a
biorthogonal operator adapted wavelet
basis $f = \sum_{\lambda} \langle f, \theta_{\lambda} \rangle \mu_{\lambda}$, with $\theta_{\lambda} = L^{*-1} \psi_{\lambda}$ and
$\mu_{\lambda} = L \psi_{\lambda}$, $*$ denoting the adjoint operator. By
construction, $\theta$ and $\mu$ are biorthogonal, that is,
such that $\langle \theta_{\lambda}, \mu_{\lambda'} \rangle = \delta_{\lambda,\lambda'}$. It can be shown that
both have similar localization properties in physical
and Fourier space as $\psi$, and that they form a Riesz
basis.
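The claim that the stiffness matrix reduces to the identity can be checked in one line. The sketch uses only the definition of the test functions as the inverse adjoint operator applied to the wavelets, $\theta_{\lambda} = L^{*-1}\psi_{\lambda}$, and the orthonormality of the wavelets $\psi_{\lambda}$:

```latex
\[
  \langle L \psi_{\lambda}, \theta_{\lambda'} \rangle
  \;=\; \langle \psi_{\lambda}, L^{*} \theta_{\lambda'} \rangle
  \;=\; \langle \psi_{\lambda}, L^{*} L^{*-1} \psi_{\lambda'} \rangle
  \;=\; \langle \psi_{\lambda}, \psi_{\lambda'} \rangle
  \;=\; \delta_{\lambda,\lambda'} .
\]
```

With this choice, the Petrov-Galerkin system decouples: each unknown coefficient is obtained directly as an inner product of the right-hand side with one test function, so no linear system has to be assembled or inverted.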


Adaptive discretization To get an adaptive space
discretization for the linear problem $Lu = f$, we
consider only the significant wavelet coefficients of
the solution. Hence, we only retain coefficients $\tilde u^{n}_{\lambda}$
whose modulus is larger than a given threshold $\varepsilon$,
that is, $|\tilde u^{n}_{\lambda}| > \varepsilon$. The corresponding coefficients
are shown in Figure 7 (white area under the solid
line curve).

Figure 7 Illustration of the dynamic adaption strategy in
wavelet coefficient space.


Adaption strategy To be able to integrate the
equation in time we have to account for the
evolution of the solution in wavelet coefficient
space (indicated by the arrow in Figure 7). Therefore,
we add at time step $t^{n}$ the neighbors to the
retained coefficients, which constitute a security
zone (gray area in Figure 7). The equation is then
solved in this enlarged coefficient set (white and
gray areas below the curves in Figure 7) to obtain
$\tilde u^{n+1}_{\lambda}$. Subsequently, we threshold the coefficients
and retain only those whose modulus $|\tilde u^{n+1}_{\lambda}| > \varepsilon$
(coefficients under the dashed curve in Figure 7).
This strategy is applied in each time step and hence
allows one to track automatically the evolution of the
solution in both scale and space.
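The bookkeeping behind this strategy can be sketched for a 1D periodic wavelet basis indexed by scale $j$ and position $i$ with $0 \le i < 2^j$. The exact definition of "neighbors" is not given in the text, so the choice made here (the two adjacent positions at the same scale and the two children at the next finer scale) is an assumption, as are the function names.

```python
def add_security_zone(active, max_level):
    """Enlarge a set of (j, i) indices by neighbors in space and scale.

    Assumed neighbor rule: positions i-1 and i+1 at the same scale
    (periodic), and the two children (2i, 2i+1) at scale j+1.
    """
    enlarged = set(active)
    for j, i in active:
        n = 2 ** j
        enlarged.add((j, (i - 1) % n))
        enlarged.add((j, (i + 1) % n))
        if j + 1 <= max_level:
            enlarged.add((j + 1, 2 * i))
            enlarged.add((j + 1, 2 * i + 1))
    return enlarged


def adapt(coeffs, eps, max_level):
    """One adaption step on a dict {(j, i): coefficient}.

    Keeps the indices whose coefficient modulus exceeds eps, then adds
    the security zone in which the next time step is solved.
    """
    retained = {idx for idx, c in coeffs.items() if abs(c) > eps}
    return add_security_zone(retained, max_level)


# Example: only one coefficient is significant.
coeffs = {(2, 0): 1.0, (2, 1): 1e-6, (2, 2): 0.0, (2, 3): 0.0}
index_set = adapt(coeffs, eps=1e-3, max_level=3)
```

Adding finer-scale children lets the index set follow a solution that steepens, while the same-scale neighbors let it follow a solution that moves; coefficients that stay below the threshold after the step are dropped again.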


Evaluation of the nonlinear term For the
evaluation of the nonlinear term $f(u^{n})$, where the
wavelet coefficients $\tilde u^{n}$ are given, there are two
possibilities:

• Evaluation in wavelet coefficient space. As an
illustration, we consider a quadratic nonlinear
term, $f(u) = u^2$. The wavelet coefficients of $f$ can
be calculated using the connection coefficients,
that is, one has to calculate the bilinear expression
$\sum_{\lambda} \sum_{\lambda'} \tilde u_{\lambda} T_{\lambda \lambda' \lambda''} \tilde u_{\lambda'}$ with the interaction
tensor $T_{\lambda \lambda' \lambda''} = \langle \psi_{\lambda} \psi_{\lambda'}, \theta_{\lambda''} \rangle$. Although many
coefficients of $T$ are zero or very small, the size of $T$
leads to a computation which is quite intractable
in practice.

• Evaluation in physical space. This approach is
similar to the pseudospectral evaluation of the
nonlinear terms used in spectral methods; it is
therefore called the pseudowavelet technique. The
advantage of this scheme is that general nonlinear
terms, for example, $f(u) = (1 - u)e^{-C/u}$, can be
treated more easily. The method can be summarized
as follows: starting from the significant
wavelet coefficients, $|\tilde u_{\lambda}| > \varepsilon$, one reconstructs $u$
on a locally refined grid and gets $u(x_{\lambda})$. Then one
can evaluate $f(u(x_{\lambda}))$ pointwise and the wavelet
coefficients $\tilde f_{\lambda}$ are calculated using the adaptive
decomposition.
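The second option can be sketched in a few lines. The example below is a deliberately simplified stand-in: it uses an orthonormal Haar basis and reconstructs on the full dyadic grid rather than on a locally refined one, and it always keeps the scaling coefficient. These simplifications, and all function names, are assumptions made for illustration.

```python
import math

_S = 1.0 / math.sqrt(2.0)


def haar_forward(values):
    """Orthonormal Haar transform; len(values) must be a power of two."""
    w = list(values)
    n = len(w)
    while n > 1:
        half = n // 2
        approx = [(w[2 * i] + w[2 * i + 1]) * _S for i in range(half)]
        detail = [(w[2 * i] - w[2 * i + 1]) * _S for i in range(half)]
        w[:n] = approx + detail
        n = half
    return w


def haar_inverse(coeffs):
    """Inverse of haar_forward."""
    x = list(coeffs)
    n = 1
    while n < len(x):
        merged = []
        for a, d in zip(x[:n], x[n:2 * n]):
            merged += [(a + d) * _S, (a - d) * _S]
        x[:2 * n] = merged
        n *= 2
    return x


def nonlinear_term_coeffs(u_coeffs, f, eps):
    """Pseudowavelet evaluation of the wavelet coefficients of f(u).

    1. keep the significant coefficients of u (modulus > eps),
    2. reconstruct u on the grid,
    3. evaluate f pointwise,
    4. transform f(u) back to wavelet space.
    """
    significant = [c if (k == 0 or abs(c) > eps) else 0.0
                   for k, c in enumerate(u_coeffs)]
    u_grid = haar_inverse(significant)
    f_grid = [f(u) for u in u_grid]
    return haar_forward(f_grid)


# Example: quadratic nonlinearity f(u) = u^2 on a small grid.
u_coeffs = haar_forward([1.0, 2.0, 3.0, 4.0])
f_coeffs = nonlinear_term_coeffs(u_coeffs, lambda u: u * u, eps=0.0)
```

The cost is two transforms plus one pointwise evaluation, independent of the form of `f`, which is why this route handles non-polynomial terms as easily as quadratic ones.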


Finally, one computes the scalar products of the
RHS of [21] with the test functions $\theta$ to advance the
solution in time. We compute $\tilde u_{\lambda} = \langle f, \theta_{\lambda} \rangle$ belonging
to the enlarged coefficient set (white and gray
regions in Figure 7).

The algorithm is of O(N) complexity, where N 
denotes the number of wavelet coefficients retained 
in the computation. 


Application to 2D Turbulence 


To illustrate the above algorithm we present an 
adaptive wavelet computation of a vortex dipole in 
a square domain, impinging on a no-slip wall at 
Reynolds number Re= 1000. To take into account 
the solid wall, we use a volume penalization
method, for which both the fluid flow and the
solid container are modeled as a porous medium
whose permeability tends towards infinity in the fluid and
towards zero in the solid region.

The 2D Navier-Stokes equations are thus modified
by adding the forcing term $F = -(1/\eta)\chi v$
in eqn [17], where $\eta$ is the penalization parameter
and $\chi$ is the characteristic function whose value is 1
in the solid region and 0 elsewhere. The equations
are solved using the adaptive wavelet method in
a periodic square domain of size 1.1, in which
the square container of size 1 is embedded,
taking $\eta = 10^{-3}$. The maximal resolution corresponds
to a fine grid of $1024^2$ points. Figure 8a
shows snapshots of the vorticity field at times
$t = 0.2$, $0.4$, $0.6$, and $0.8$ (in arbitrary units). We
observe that the vortex dipole is moving towards 
the wall and that strong vorticity gradients are 
produced when the dipole hits the wall. The 
computational grid is dynamically adapted during 
the flow evolution, since the nonlinear wavelet filter 
automatically refines the grid in regions where 
strong gradients develop. Figure 8b shows the 
centers of the retained wavelet coefficients at 
corresponding times. 

Note that during the computation only 5% out of
$1024^2$ wavelet coefficients are used. The time
evolution of the total kinetic energy and the total
enstrophy $Z = \frac{1}{2} \int |\omega|^2\,dx$ are plotted in Figure 9 to
show the production of enstrophy and the concomitant
dissipation of energy when the vortex dipole
hits the wall.

Figure 8 Dipole wall interaction at Re = 1000. (a) Vorticity field, (b) corresponding centers of the active wavelets, at $t = 0.2$, 0.4, 0.6,
and 0.8 (from top to bottom).

This computation illustrates the fact that the
adaptive wavelet method allows an automatic grid
refinement, both in the boundary layers at the
wall and also in shear layers which develop during
the flow evolution far from the wall. As a result,
the number of grid points necessary for the
computation is significantly reduced, and we conjecture
that the resulting compression rate will
increase with the Reynolds number.

Figure 9 Time evolution of energy (solid line) and enstrophy
(dashed line).


Introduction


Wavelet analysis was first developed in the early 
1980s in the field of seismic signal analysis in the 
form of an integral transform with a localized kernel 
function with continuous parameters of dilation and 
translation. When a seismic wave or its derivative
has a singular point, the integral transform has a
scaling property with respect to the dilation parameter;
thus, this scaling behavior can be used to
locate the singular point. In the mid-1980s, the
orthonormal smooth wavelet was first constructed, 
and later the construction method was generalized 
and reformulated as multiresolution analysis 
(MRA). Since then, several kinds of wavelets have 
been proposed for various purposes, and the concept 
of wavelet has been extended to new types of basis 
functions. In this sense, the most important effect of 
wavelets may be that they have awakened deep 
interest in bases employed in data analysis and data 
processing. Wavelets are now widely used in various
fields of research; some of their applications are
discussed in this article.

From the perspective of time-frequency analysis, 
the wavelet analysis may be regarded as a windowed 
Fourier analysis with a variable window width, 
narrower for higher frequency. The wavelets can 
therefore give information on the local frequency 
structure of an event; they have been applied to 
various kinds of one-dimensional (1D) or multi- 
dimensional signals, for example, to identify an 
event or to denoise or to sharpen the signal. 

1D wavelets $\psi^{(a,b)}(x)$ are defined as

\[ \psi^{(a,b)}(x) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{x - b}{a}\right) \]

where $a\,(\neq 0)$, $b$ are real parameters and $\psi(x)$ is a
spatially localized function called “analyzing wavelet”
or “mother wavelet.” Wavelet analysis gives a
decomposition of a function into a linear combina- 
tion of those wavelets, where a perfect reconstruc- 
tion requires the analyzing wavelet to satisfy some 
mathematical conditions. 

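For the continuous wavelet transform (CWT),
where the parameters $(a, b)$ are continuous, the
analyzing wavelet $\psi(x) \in L^2(\mathbb{R})$ has to satisfy the
admissibility condition

\[ C_\psi = \int_{-\infty}^{\infty} \frac{|\hat\psi(\omega)|^2}{|\omega|}\,d\omega < \infty \]

where $\hat\psi(\omega)$ is the Fourier transform of $\psi(x)$:

\[ \hat\psi(\omega) = \int_{-\infty}^{\infty} e^{-i\omega x}\,\psi(x)\,dx \]

The admissibility condition is known to be equivalent
to the condition that $\psi(x)$ has no zero-frequency
component, that is, $\hat\psi(0) = 0$, under some mild
condition for the decay rate at infinity. Then the
CWT and its inverse transform of a data function
$f(x) \in L^2(\mathbb{R})$ are defined as

\[ T_\psi(a, b) = \frac{1}{\sqrt{C_\psi}} \int_{-\infty}^{\infty} \overline{\psi^{(a,b)}(x)}\,f(x)\,dx \]

\[ f(x) = \frac{1}{\sqrt{C_\psi}} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} T_\psi(a, b)\,\psi^{(a,b)}(x)\,\frac{da\,db}{a^2} \]

In the case of the discrete wavelet transform
(DWT), the parameters $(a, b)$ are taken discrete; a
typical choice is $a = 1/2^j$, $b = k/2^j$, where $j$ and $k$ are
integers:

\[ \psi_{j,k}(x) = 2^{j/2}\,\psi(2^j x - k) \]

In order that the wavelets $\{\psi_{j,k}(x) \mid j, k \in \mathbb{Z}\}$ may
constitute a complete orthonormal system in $L^2(\mathbb{R})$,
the analyzing wavelet should satisfy more stringent
conditions than the admissibility condition for the
CWT, and is now constructed in the framework of
MRA. A data function is then decomposed by the
DWT as

\[ f(x) = \sum_{j=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \alpha_{j,k}\,\psi_{j,k}(x), \qquad \alpha_{j,k} = \int_{-\infty}^{\infty} \overline{\psi_{j,k}(x)}\,f(x)\,dx \]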
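The dyadic family $\psi_{j,k}(x) = 2^{j/2}\psi(2^j x - k)$ and its orthonormality can be checked numerically for the simplest case. The sketch below assumes the Haar wavelet as the mother wavelet (chosen only because it is easy to write down) and uses a midpoint rule whose cells line up with the breakpoints of the wavelets, so the inner products are exact up to rounding for the scales used. The function names are hypothetical.

```python
def haar_mother(x):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0


def psi(j, k, x):
    """Dyadic wavelet psi_{j,k}(x) = 2^(j/2) * psi(2^j * x - k)."""
    return 2.0 ** (j / 2.0) * haar_mother(2.0 ** j * x - k)


def inner(j1, k1, j2, k2, lo=-4.0, hi=4.0, n=4096):
    """Midpoint-rule approximation of the L2 inner product on [lo, hi]."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        total += psi(j1, k1, x) * psi(j2, k2, x)
    return total * dx


# Example: norm of one wavelet and overlap of two different ones.
norm_sq = inner(2, 1, 2, 1)
overlap = inner(0, 0, 2, 1)
```

The factor $2^{j/2}$ is what keeps the norm equal to one as the support shrinks by $2^{-j}$; wavelets at the same scale have disjoint supports, and wavelets at different scales are orthogonal because the finer one integrates to zero over any interval on which the coarser one is constant.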


Even when the discrete wavelets do not constitute
a complete orthonormal system, they often form a
wavelet frame if linear combinations of the wavelets
are dense in $L^2(\mathbb{R})$ and if there are two positive constants $A$,
$B$ such that the inequality

\[ A\|f\|^2 \le \sum_{j,k} |\langle \psi_{j,k}, f \rangle|^2 \le B\|f\|^2 \]

holds for an arbitrary $f(x) \in L^2(\mathbb{R})$. For the wavelet
frame $\{\psi_{j,k}\}$, there is a corresponding dual frame,
$\{\tilde\psi_{j,k}\}$, which permits the following expansion of $f(x)$:

\[ f(x) = \sum_{j,k} \langle \psi_{j,k}, f \rangle\,\tilde\psi_{j,k}(x) = \sum_{j,k} \langle \tilde\psi_{j,k}, f \rangle\,\psi_{j,k}(x) \]

The wavelet frame is also employed in several
applications.

From the viewpoint of applications, the CWT is
better adapted to the analysis of data functions,
including the detection of singularities and patterns,
while the DWT is better adapted to data processing,
including signal compression or denoising.


Singularity Detection and Multifractal 
Analysis of Functions 


Since its birth, wavelet analysis has been applied
to the detection of singularities of a data function.
The Hölder exponent $h(x_0)$ at $x_0$ of a
function $f(x)$ is defined here as the largest value of
the exponent $h$ such that there exists a polynomial
$P_n(x)$ of degree $n$ that satisfies, for $x$ in the
neighborhood of $x_0$,

\[ |f(x) - P_n(x - x_0)| = O(|x - x_0|^{h}) \]

The data function is not differentiable if $h(x_0) < 1$,
but if $h(x_0) > 1$ then it is differentiable and a
singularity may arise in its higher derivatives. The
wavelet transform is applied to find the Hölder
exponent $h(x_0)$, because $T_\psi(a, b)$ has an asymptotic
behavior $T_\psi(a, b) = O(a^{h(x_0) + 1/2})$ $(a \to 0)$ if the analyzing
wavelet has $N\,(> h(x_0))$ vanishing moments,
that is,

\[ \int_{-\infty}^{\infty} x^{m}\,\psi(x)\,dx = 0, \qquad m \in \mathbb{Z},\ 0 \le m < N \]

A commonly used analyzing wavelet for this purpose
is the $N$th derivative of the Gaussian
function, $\psi(x) = d^{N}(e^{-x^2/2})/dx^{N}$. This method works
well to examine a single or some finite number of
singular points of the data function.
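The scaling argument can be tried on a function with a known exponent. The sketch below assumes the second derivative of the Gaussian as analyzing wavelet ($N = 2$), omits the constant normalization factor $1/\sqrt{C_\psi}$ (which does not affect the slope), and estimates $h(x_0)$ from a least-squares fit of $\log|T_\psi(a, x_0)|$ against $\log a$. The test function $1 + 2x + |x|^{1/2}$, the scale range, and all function names are choices made for this illustration.

```python
import math


def gaussian_second_derivative(u):
    """psi(u) = d^2/du^2 exp(-u^2/2) = (u^2 - 1) exp(-u^2/2)."""
    return (u * u - 1.0) * math.exp(-0.5 * u * u)


def cwt(f, a, b, half_width=10.0, n=4001):
    """T(a, b) = a^(-1/2) * integral of psi((x - b)/a) f(x) dx, for a > 0.

    Trapezoid rule in the variable u = (x - b)/a on [-half_width, half_width].
    The factor 1/sqrt(C_psi) is omitted.
    """
    du = 2.0 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        u = -half_width + i * du
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * gaussian_second_derivative(u) * f(b + a * u)
    # dx = a du, so the prefactor a^(-1/2) * a equals sqrt(a).
    return math.sqrt(a) * total * du


def holder_exponent(f, x0, scales):
    """Estimate h(x0) from the slope of log|T(a, x0)| versus log a."""
    xs = [math.log(a) for a in scales]
    ys = [math.log(abs(cwt(f, a, x0))) for a in scales]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope - 0.5


# Example: a cusp of exponent 1/2 at x0 = 0 on top of a linear trend.
f = lambda x: 1.0 + 2.0 * x + abs(x) ** 0.5
h_est = holder_exponent(f, 0.0, [2.0 ** -k for k in range(2, 9)])
```

The constant and linear parts of the test function are annihilated by the two vanishing moments of the wavelet, so only the cusp contributes to the transform; this is the role of the condition $N > h(x_0)$.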

When the data function is a multifractal function 
with an infinite number of singular points of various
strengths, the multifractal property of the data 
function is often characterized by the singularity 
spectrum D(h) which denotes the Hausdorff dimen- 
sion of the set of points where h(x)=h. The 
singularity spectrum is, however, difficult to obtain 
directly from the CWT, and the Legendre transfor- 
mation is introduced to bypass the difficulty. 

Fully developed 3D fluid turbulence may be a
typical example of wavelet application to
singularity detection. The Kolmogorov similarity
law of fluid turbulence for the longitudinal velocity
increment $\Delta u(r) = e \cdot (u(x + re) - u(x))$, where $u(x)$
is the velocity field and $e$ is a constant unit vector,
predicts a scaling property of the structure function;
for $r$ in the inertial subrange,

\[ \langle (\Delta u(r))^{p} \rangle \sim r^{\zeta_p}, \qquad \zeta_p = p/3 \]

where $\langle \cdot \rangle$ denotes the statistical mean. In reality,
however, the scaling exponent $\zeta_p$ measured in
experiments shows a systematic deviation from $p/3$,
which is considered to be a reflection of intermittency,
namely the spatial nonuniformity or multifractal
property of active vortical motions in
turbulence. For simplicity, let us consider the
velocity field on a linear section of the turbulence
field. According to the multifractal formalism, the
turbulence velocity field has singularities of various
strengths described by the singularity spectrum
$D(h)$, which is related to the scaling exponent $\zeta_p$
through the Legendre transform, $D(h) = \inf_p (ph - \zeta_p + 1)$.
This relation is often used to determine $D(h)$
from the knowledge of $\zeta_p$ (structure function
method). However, this method does not necessarily
work well because, for example, it does not capture
the singular points of Hölder exponent larger
than 1 and it is unstable for $h < 0$.
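The Legendre transform that turns scaling exponents into a singularity spectrum is easy to evaluate numerically. The sketch below takes the infimum over a finite grid of orders $p$, and uses a hypothetical quadratic model $\zeta_p = c_1 p - c_2 p^2$ whose coefficients are illustrative values, not measurements; for this model the transform has the closed form $D(h) = 1 - (h - c_1)^2 / (4 c_2)$, which gives a way to check the code.

```python
def singularity_spectrum(zeta, h, p_values):
    """D(h) = inf over p of (p * h - zeta(p) + 1), on a finite grid of p."""
    return min(p * h - zeta(p) + 1.0 for p in p_values)


# Hypothetical quadratic model of the scaling exponents (illustrative values).
C1 = 1.0 / 3.0 + 0.2 / 6.0
C2 = 0.2 / 18.0


def zeta_model(p):
    """zeta_p = C1 * p - C2 * p^2, chosen so that zeta_3 = 1."""
    return C1 * p - C2 * p * p


# Grid of orders p from -20 to 20 in steps of 0.01.
p_grid = [i * 0.01 for i in range(-2000, 2001)]

d_peak = singularity_spectrum(zeta_model, C1, p_grid)
d_side = singularity_spectrum(zeta_model, C1 + 0.1, p_grid)
```

A strictly linear law $\zeta_p = p/3$ would give a spectrum concentrated at the single exponent $h = 1/3$; any curvature in $\zeta_p$ spreads $D(h)$ over a range of exponents, which is how intermittency appears in this formalism. Note that exponents $h$ on one side of the peak correspond to negative orders $p$, which are the ones that are hard to measure with structure functions.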

These difficulties are not restricted to
turbulence research, but arise commonly when the
structure function is employed to determine the
singularity spectrum. In these problems, the CWT
$T_\psi(a, b)$ provides an alternative method. An ingenious
technique is to take only the modulus maxima
of $T_\psi(a, b)$ (for each fixed $a$) to construct a
partition function

\[ Z(a, q) = \sum_{l \in L_{\max}} \Bigl( \sup_{(a', b) \in l,\ a' \le a} |T_\psi(a', b)| \Bigr)^{q} \]

where $q \in \mathbb{R}$, and $L_{\max}$ denotes the set of all maxima
lines, each of which is a continuous curve for small
values of $a$; there exists at least one maxima line
toward a singular point of Hölder exponent
$h(x_0) < N$. In the limit of $a \to 0$, defining the
exponent $\tau(q)$ as $Z(a, q) \sim a^{\tau(q)}$, one can obtain the
singularity spectrum through the Legendre
transform:

\[ D(h) = \inf_{q} \left[ q \left( h + \tfrac{1}{2} \right) - \tau(q) \right] \]

This method (the wavelet-transform modulus-maxima
(WTMM) method) is advantageous in that it works
also for singularities of $h > 1$ and $h < 0$. Several
simple examples of multifractal functions have been
successfully analyzed by this method. For fluid
turbulence, this method gives a singularity spectrum
$D(h)$ which has a peak value of $\sim 1$ at $h \sim 1/3$,
consistent with the Kolmogorov similarity law, but
has a convex shape around $h = 1/3$, suggesting a
multifractal property. For a fractal signal, we note
that the WTMM method reveals the hierarchical
organization of the singularities, in the branching
structure of the WT skeleton defined by the
arrangement of the maxima lines in the $(a, b)$ half-plane.

Though the above discussion also applies to the
DWT, the detection of the Hölder exponent $h$ in
experimental situations is usually performed by the
CWT, which has no restriction on possible values of
$a$, while the DWT is often employed for theoretical
discussions of singularity and multifractal structure
of a function.


Multiscale Analysis 


The wavelet transform expands a data function in the
time-frequency or the position-wavenumber space,
which has twice the dimension of the original signal,
and makes it easier to perform a multiscale analysis
and to identify events involved in the signal. In the
wavelet transform, as stated above, the time resolution
is higher at higher frequency, in contrast with
the windowed Fourier transform where the time and
the frequency resolutions are independent of frequency.
Another advantage of wavelets is the wide
variety of analyzing wavelets, which enables us to
optimize the wavelet according to the purpose of
data analysis. Both the CWT and the DWT are
available for these time-frequency or
position-wavenumber analyses. However, the CWT has
properties quite different from those of familiar
orthonormal bases of discrete wavelets.


Multidimensional CWT 


The CWT can be formulated in an abstract way. We
can regard $G = \{(a, b) \mid a\,(\neq 0),\ b \in \mathbb{R}\}$ as an affine
group on $\mathbb{R}$ with the group operation
$(a, b)(a', b') = (aa', ab' + b)$ associated with the
invariant measure $d\mu = da\,db/a^2$. The group $G$ has
its unitary representation in the Hilbert space
$H = L^2(\mathbb{R})$:

\[ (U(a, b)f)(x) = \frac{1}{\sqrt{|a|}}\,f\!\left(\frac{x - b}{a}\right) \]

and then the CWT can be constructed
as a linear map $W$ from $L^2(\mathbb{R})$ to $L^2(G; da\,db/a^2)$:

\[ W : f(x) \mapsto T(a, b) = \frac{1}{\sqrt{C_\psi}}\,\langle U(a, b)\psi, f \rangle \]

where $\langle \cdot, \cdot \rangle$ is the inner product of $L^2(\mathbb{R})$ with the
complex conjugate taken at the first element, and
$\psi(x)$ is a unit vector (analyzing wavelet) satisfying
the abstract admissibility condition

\[ C_\psi = \int_G |\langle U(a, b)\psi, \psi \rangle|^2\,d\mu < \infty \]

This formulation is applicable also to a locally
compact group $G$ and its unitary and square
integrable representation in a Hilbert space $H$.
Note that even the canonical coherent states are
included in this framework by taking the
Weyl-Heisenberg group and $L^2(\mathbb{R})$ for $G$ and $H$,
respectively. This abstract formulation allows us
to extend the CWT to higher-dimensional Euclidean
spaces and other manifolds: for example, the 2D
sphere $S^2$ for geophysical application and the 4D
manifold of spacetime taking the Poincaré group
into consideration.

In $\mathbb{R}^n$, the CWT of $f(x) \in L^2(\mathbb{R}^n)$ and its inverse
transform are given by

\[ T_\psi(a, r, b) = \frac{1}{\sqrt{C_\psi}} \int_{\mathbb{R}^n} \overline{\psi^{(a,r,b)}(x)}\,f(x)\,dx \]

\[ f(x) = \frac{1}{\sqrt{C_\psi}} \int T_\psi(a, r, b)\,\psi^{(a,r,b)}(x)\,\frac{da\,dr\,db}{a^{n+1}} \]

where $r \in SO(n)$, $b \in \mathbb{R}^n$, $dr$ is the normalized invariant
measure of $SO(n)$, and the wavelets are
defined as $\psi^{(a,r,b)}(x) = (1/a^{n/2})\,\psi(r^{-1}(x - b)/a)$, with
the analyzing wavelet satisfying the corresponding
admissibility condition.

Note that these wavelets are constructed not only
by dilation and translation but also by rotation,
which therefore gives the possibility of directional
pattern detection in a data function. In the case of
the 2D sphere $S^2$, on the other hand, the dilation
operation should be reinterpreted in such a way
that at the North Pole, for example, it is the normal
dilation in the tangent plane followed by lifting it
to $S^2$ by the stereographic projection from the
South Pole.

Generally, the abstract map $W$ thus defined is injective and therefore invertible on its range, but not surjective, in contrast with the Fourier case. In fact, in the case of the 1D CWT, $T_\psi(a, b)$ is subject to an integral condition:

$$T_\psi(a, b) = \int K(a, b; a', b')\, T_\psi(a', b')\, \frac{da'\, db'}{a'^2}, \qquad K(a, b; a', b') = \frac{1}{C_\psi} \int \overline{\psi^{(a,b)}(x)}\, \psi^{(a',b')}(x)\, dx$$

which defines the range of the CWT, a subspace of $L^2(\mathbb{R}^2; da\,db/a^2)$. Therefore, if one wants to modify $T_\psi(a, b)$ by, for example, setting its value to zero in some parameter region, just as in a filter process, care should be taken that the resulting $T_\psi(a, b)$ stays in the image of the CWT. The reason may be understood intuitively by noticing that the wavelets $\psi^{(a,b)}(x)$ are linearly dependent on each other. The expression of a data function by a linear combination of the wavelets is therefore not unique, and thus is redundant. The CWT gives only the $T_\psi(a, b)$ of least norm in $L^2(\mathbb{R}^2; da\,db/a^2)$. In physical interpretations of the CWT, however, this nonuniqueness is often ignored.


Pattern Detection 


Edge detection The edges of an object are often the most important components for pattern detection. An edge may be considered to consist of points of sharp transition of image intensity. At the edge, the modulus of the gradient of the image $f(x, y)$ is expected to take a local maximum in the 1D direction perpendicular to the edge. Therefore, the local maxima of $|\nabla f(x, y)|$ may serve as the indicator of the edge. However, image textures can also give similar sharp transitions of $f(x, y)$, and one should take into account the scale dependence, which distinguishes between edges and textures. One practical way to do this is to use dyadic wavelets $\psi^m_j(x, y) = 2^j \psi^m(2^j x, 2^j y)$, which are generated from the two wavelets $(\psi^1, \psi^2) = (-\partial\theta/\partial x, -\partial\theta/\partial y)$, where $\theta$ is a localized function (multiscale edge detection method). The dyadic wavelet transform of the image $f(x, y)$,

$$T^m_j(b_1, b_2) = \langle f(x, y), \psi^m_j(x - b_1, y - b_2) \rangle, \qquad m = 1, 2,$$

defines the multiscale edges as the set of points $b = (b_1, b_2)$ where the modulus of the wavelet transform, $|(T^1_j, T^2_j)|$, takes a locally maximum value (wavelet transform modulus maxima, WTMM) in a 1D neighborhood of $b$ in the direction of $(T^1_j(b), T^2_j(b))$. The scale dependence of the magnitude of the modulus maxima is related to the Hölder exponent of $f(x, y)$, similarly to the 1D case, and thus gives information to distinguish between the edges and the textures.
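A one-dimensional analogue may help to see how the scale dependence of the modulus maxima separates an edge from a texture. The sketch below is only an illustration under stated assumptions: it uses a derivative-of-Gaussian wavelet $\psi_s(x) = (x/s^2)\exp(-x^2/2s^2)$ at integer sample positions (a normalization chosen so that a unit step gives a response close to 1 at every scale), a synthetic signal made of a step plus a fast oscillation, and hypothetical function names.

```python
import math

def derivative_wavelet(scale):
    """Samples of psi_s(i) = (i / s^2) exp(-i^2 / (2 s^2)), i.e. -d(theta_s)/dx
    for the (unnormalized) Gaussian theta_s; an assumed, illustrative choice."""
    n = int(4 * scale)
    return {i: (i / scale ** 2) * math.exp(-0.5 * (i / scale) ** 2)
            for i in range(-n, n + 1)}

def transform(signal, scale):
    """T(b) = sum_i f(b + i) psi_s(i), evaluated away from the boundaries."""
    psi = derivative_wavelet(scale)
    n = max(psi)
    return {b: sum(signal[b + i] * w for i, w in psi.items())
            for b in range(n, len(signal) - n)}

def modulus_maxima(signal, scale, threshold=0.1):
    """Positions where |T(b)| is a local maximum above the threshold."""
    t = transform(signal, scale)
    positions = sorted(t)
    found = []
    for b in positions[1:-1]:
        m = abs(t[b])
        if m > threshold and m > abs(t[b - 1]) and m >= abs(t[b + 1]):
            found.append(b)
    return found

# Unit step at x = 100 (the "edge") plus a period-4 oscillation (the "texture").
signal = [(1.0 if x >= 100 else 0.0) + 0.2 * math.sin(math.pi * x / 2)
          for x in range(200)]
fine = modulus_maxima(signal, scale=1)    # texture and edge both respond
coarse = modulus_maxima(signal, scale=4)  # only the edge survives
print(len(fine), coarse)
```

At the fine scale the texture produces many maxima, while at the coarse scale only the maximum attached to the step remains, which is the behaviour the multiscale edge detection method exploits.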

Conversely, the WTMM information $b_{j,p} = (b_{1;j,p}, b_{2;j,p})$ of the multiscale edges can be used for an approximate reconstruction of the original image, although perfect reconstruction cannot be expected because the modulus maxima wavelets are not complete. Assuming that $\{\psi^1_{j,p}, \psi^2_{j,p}\} = \{\psi^1_j(x - b_{j,p}), \psi^2_j(x - b_{j,p})\}$ constitutes a frame of the closed linear space generated by $\{\psi^1_{j,p}, \psi^2_{j,p}\}$, an approximate image $\tilde f$ is obtained by inverting the relation

$$L\tilde f \equiv \sum_m \sum_{j,p} \langle \tilde f, \psi^m_{j,p} \rangle\, \psi^m_{j,p} = \sum_m \sum_{j,p} T^m_j(b_{j,p})\, \psi^m_{j,p}$$

using, for example, a conjugate gradient algorithm, where a fast calculation is possible with a filter bank algorithm for the dyadic wavelet ("algorithme à trous"). This algorithm gives only the solution of minimum norm among all possible solutions, but it is often satisfactory for practical purposes and thus is applicable also to data compression.


Directional detection To detect oriented features such as segments or edges in images, a directionally selective wavelet for the CWT is desired. A useful wavelet for this purpose is one whose Fourier transform has its effective support in a convex cone with apex at the origin in wave number space. A typical example of a directional wavelet is the 2D Morlet wavelet:

$$\psi(x) = \exp(i k_0 \cdot x)\, \exp(-|Ax|^2)$$

where $k_0$ is the center of the support in Fourier space, and $A$ is the $2 \times 2$ matrix $\mathrm{diag}[\varepsilon^{1/2}, 1]$ ($\varepsilon \le 1$); the admissibility condition for the CWT is approximately satisfied for $|k_0| > 5$. Another example is the Cauchy wavelet, which has its support strictly in a convex cone in wave number space.
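The directional selectivity of the 2D Morlet wavelet can be illustrated with a brute-force sum on a small grid. This is a sketch under assumptions: the parameter values ($k_0 = (6, 0)$, $\varepsilon = 0.5$), the grid, the plane-wave test image, and the function names are all illustrative and not taken from the text. The wavelet is rotated by an angle $\theta$ and correlated with a plane wave travelling along the $x$ axis.

```python
import cmath
import math

def morlet2d(x, y, k0=(6.0, 0.0), eps=0.5):
    """psi(x) = exp(i k0.x) exp(-|A x|^2) with A = diag[eps^(1/2), 1]."""
    ax, ay = math.sqrt(eps) * x, y
    return cmath.exp(1j * (k0[0] * x + k0[1] * y)) * math.exp(-(ax * ax + ay * ay))

def rotated(x, y, theta):
    """Wavelet rotated by theta: psi(R_theta^{-1} x)."""
    c, s = math.cos(theta), math.sin(theta)
    return morlet2d(c * x + s * y, -s * x + c * y)

def response(theta, q=(6.0, 0.0), half=5.0, step=0.25):
    """|<psi_theta, f>| for the plane wave f(x) = cos(q.x), by a Riemann sum."""
    n = int(half / step)
    total = 0j
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            x, y = i * step, j * step
            total += rotated(x, y, theta).conjugate() * math.cos(q[0] * x + q[1] * y)
    return abs(total) * step * step

r0 = response(0.0)             # wavelet aligned with the wave vector
r45 = response(math.pi / 4)    # rotated by 45 degrees
r90 = response(math.pi / 2)    # rotated by 90 degrees
print(r0, r45, r90)
```

The aligned response is close to $\pi/(2\sqrt{\varepsilon})$, while the rotated wavelets respond only weakly, so scanning $\theta$ classifies a pattern by its direction.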

These wavelets have directional selectivity, with a preference for slender objects in a specific direction. One of their applications is the analysis of the velocity field of fluid motion from experimental data, where many tiny plastic balls distributed in the fluid give a lot of line segments in a picture taken with a short exposure. The directional wavelet analysis of the picture classifies the line segments according to their directions, indicating the directions of fluid velocity. Another example is wave-field analysis, where many waves in different directions are superimposed; the directional wavelets allow one to decompose the wave field into the component waves. Directional wavelets have also been applied successfully to detect the symmetry of objects such as crystals or quasicrystals.


Denoising and separation of signals The wavelet frame, as well as the CWT, gives a redundant representation of a data function. If, instead of the original data, the redundant expression is transmitted, the redundancy can be used to reduce the noise included in the received data: the redundancy requires the data to belong to a subspace, and the projection of the received data onto that subspace removes the noise component orthogonal to it. More specifically, the wavelet frame gives a representation of a data function as $f(x) = \sum_{j,k} \alpha_{j,k}\, \tilde\psi_{j,k}(x)$, where $\tilde\psi_{j,k}$ denotes the dual frame and the expansion coefficients $\alpha_{j,k} = \langle \psi_{j,k}, f(x) \rangle$ satisfy the defining equation of the subspace

$$\alpha_{j,k} = \sum_{j',k'} \alpha_{j',k'}\, \langle \psi_{j,k}, \tilde\psi_{j',k'} \rangle$$

If the frame coefficients are transmitted, the projection operator $P$, which is defined by the right-hand side of the above equation, reduces the noise in the received coefficients $\alpha_{j,k}$ contaminated during the transmission.
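The noise reduction by projection can be seen in the smallest possible setting. The sketch below replaces the wavelet frame by a tight frame of three unit vectors in the plane (an assumption made purely for illustration; for this frame the dual vectors are $2/3$ times the frame vectors), adds a fixed perturbation to the transmitted coefficients, and applies the projection defined by the subspace equation.

```python
import math

# Tight frame of three unit vectors at 120 degrees in R^2 (illustrative stand-in
# for a wavelet frame; frame bound 3/2, so the dual frame is (2/3) * FRAME).
FRAME = [(math.cos(t), math.sin(t))
         for t in (math.pi / 2,
                   math.pi / 2 + 2 * math.pi / 3,
                   math.pi / 2 + 4 * math.pi / 3)]

def analyze(f):
    """Frame coefficients alpha_j = <psi_j, f>."""
    return [v[0] * f[0] + v[1] * f[1] for v in FRAME]

def project(alpha):
    """(P alpha)_j = sum_k alpha_k <psi_j, dual_k>, with dual_k = (2/3) psi_k."""
    return [sum(a * (2.0 / 3.0) * (FRAME[j][0] * FRAME[k][0] + FRAME[j][1] * FRAME[k][1])
                for k, a in enumerate(alpha))
            for j in range(len(FRAME))]

def distance(u, v):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(u, v)))

f = (1.0, -2.0)
alpha = analyze(f)                       # redundant coefficients (3 numbers for 2 unknowns)
noise = (0.3, -0.1, 0.2)                 # fixed perturbation standing in for channel noise
received = [a + n for a, n in zip(alpha, noise)]
cleaned = project(received)

err_before = distance(received, alpha)
err_after = distance(cleaned, alpha)
print(err_before, err_after)
```

Clean coefficients are left unchanged by $P$, while the part of the noise orthogonal to the range of the frame analysis is removed, so the error can only decrease.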

However, this method is not applicable if the transmitted signal is not redundant. Then some a priori criterion is necessary to discriminate between signal and noise. Various criteria have been proposed in different fields. If the signal and the noise, or several signals, have different power-law spectra, then they may be discriminated by the DWT in the higher-frequency region, where the difference in the magnitude of the coefficients is significant. In this approach, wavelets of Meyer type, that is, orthogonal wavelets with compact support in Fourier space, may be preferable because the wavelets of different scales are separated, at least to some extent, in Fourier space.

In fluid dynamics, the vorticity field of 2D turbulence is found to decompose into coherent and incoherent vorticity fields, according to whether the CWT is larger than a threshold value or not, respectively. These two fields give different Fourier spectra of the velocity field ($k^{-5}$ for the coherent part and $k^{-3}$ for the incoherent part), showing that the coherent structures are responsible for the deviation from the $k^{-3}$ spectrum predicted by the classical enstrophy cascade theory. In astronomical applications, on the other hand, the data processing is performed by a more sophisticated method that takes into account interscale relations in the wavelet transform, because an astronomical image contains various kinds of objects, including stars, double stars, galaxies, nebulae, and clusters. In medical imaging, contrast analysis is indispensable for diagnostic purposes, to get a clear, detailed picture of organic structure. A scale-dependent local contrast is defined as the ratio of the CWT to that given by an analyzing wavelet with a larger support. A multiplicative scheme to improve the contrast is constructed by using the local contrast.


Signal Compression 


Signal compression is an important technology in digital communication. Speech, audio, image, and digital video are all important fields of signal compression, and plenty of compression methods have been put to practical use, but we mention only a few here.

The MRA for orthogonal wavelets gives a successive procedure to decompose a subspace of $L^2(\mathbb{R})$ into a direct sum of two subspaces corresponding to higher- and lower-frequency parts, only the latter of which is decomposed again into its higher- and lower-frequency parts. Algebraically, this procedure was already known, before the discovery of MRA, in filter theory in electrical engineering, where a discretely sampled signal is convolved with a filter series to give, for example, a high-pass-filtered or low-pass-filtered series. An appropriately designed pair of high-pass and low-pass filters followed by downsampling yields two new series corresponding to the higher- and lower-frequency parts, respectively, which can then be inverted by another two reconstruction filters with upsampling. These four filters, which are often employed in the widely used technique of "sub-band coding", then constitute a perfect reconstruction filter bank. Under some conditions, successive applications of this decomposition process to the series of lower-frequency parts, which is equivalent to the nesting structure of MRA, have been used for data compression (quadrature mirror filter). A famous example is the FBI's data compression system for fingerprints, consisting of wavelet coding with scalar quantization.

In MRA, however, it is only the lower-frequency parts that are successively decomposed. If both the lower- and the higher-frequency parts are repeatedly decomposed by the decomposition filters, then the successive convolution processes correspond to a decomposition of the data function by a set of wavelet-like functions, called "wavelet packets," where there are choices whether to decompose the higher- and/or the lower-frequency parts. The best wavelet packet, in the sense of the entropy, for example, within a specified number of decompositions, often provides a powerful tool for data compression in several areas, including speech analysis and image analysis. We also note that, from the viewpoint of the best basis that minimizes the statistical mean square error of the thresholded coefficients, an orthonormal wavelet basis gives a good concentration of the energy if the original signal is a piecewise smooth function with superimposed white noise, which is thus efficiently removed by thresholding the coefficients. The efficiency of a wavelet expansion of a signal is sometimes evaluated with the entropy of the "probability" defined as $|\alpha_j|^2 / \|f\|^2$. A better wavelet can be selected by reducing the entropy, in practice from among some set of wavelets, and its restricted expansion coefficients give a compressed signal. One of the systematic methods to generate such a suitable basis is also to employ the wavelet packets.
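The entropy criterion can be made concrete with a short computation. The sketch below is illustrative only: it assumes the "probability" is $p_j = |\alpha_j|^2 / \|f\|^2$ with Shannon entropy $-\sum_j p_j \ln p_j$, uses the orthonormal Haar transform as the wavelet expansion, and compares it with the expansion in the sample (Dirac) basis for a piecewise constant signal. The function names are hypothetical.

```python
import math

def haar_transform(samples):
    """Full orthonormal Haar decomposition of a list of length 2^J."""
    approx, details = list(samples), []
    while len(approx) > 1:
        half = len(approx) // 2
        s = [(approx[2 * i] + approx[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        d = [(approx[2 * i] - approx[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        details = d + details      # coarser details go in front
        approx = s
    return approx + details

def entropy(coefficients):
    """Shannon entropy of p_j = |alpha_j|^2 / ||f||^2 (zero terms are skipped)."""
    energy = sum(c * c for c in coefficients)
    total = 0.0
    for c in coefficients:
        p = c * c / energy
        if p > 0.0:
            total -= p * math.log(p)
    return total

signal = [1.0] * 8 + [3.0] * 8            # piecewise constant test signal
entropy_samples = entropy(signal)          # expansion in the sample basis
entropy_haar = entropy(haar_transform(signal))
print(entropy_samples, entropy_haar)
```

For this signal the Haar expansion concentrates the energy in two coefficients, so its entropy is much lower than that of the sample basis; a best-basis search over wavelet packets applies the same comparison to many candidate bases.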


Numerical Calculation 


Application of the wavelet transform, especially of the DWT, to numerical solvers for differential equations (DEs) has long been studied. At first sight, wavelets appear to give a good DE solver because the wavelet expansion is generally quite efficient compared to Fourier series, due to its spatial localization. But its implementation in an efficient computer code is not so straightforward; research is still continuing for concrete problems. Application of the CWT to spectral methods for partial differential equations (PDEs) has been studied extensively. There is no wavelet which diagonalizes the differential operator $\partial/\partial x$; therefore, an efficient numerical method is necessary for derivatives of wavelets. Products of wavelets pose another numerical problem. MRA brings about mesh points which are adaptive to some extent, but the finite element method still gives more flexible mesh points.

For some scaling-invariant differential or integral operators, including $\partial^2/\partial x^2$, Abel transformations, and the Riesz potential, adaptive biorthogonal wavelets can be provided with block-diagonal Galerkin representations, which have been applied to data processing. Generally, simultaneous localization of wavelets, both in space and in scale, leads to a sparse Galerkin representation for many pseudodifferential operators and their inverses. A thresholding technique with the DWT has been introduced to coherent vortex simulation of the 2D Navier-Stokes equations, to reduce the number of relevant wavelet coefficients. Another promising application of wavelets is as a preprocessor for an iterative Poisson solver, where wavelet-based preconditioning leads to a matrix with a bounded condition number.


Other Wavelets and Generalizations 


Several new types of wavelets have been proposed: the "coiflet", whose scaling function has vanishing moments, giving expansion coefficients approximately equal to values of the data function, and the "symlet", which is an orthonormal wavelet with a nearly symmetric profile. Multiwavelets are wavelets which give a complete orthonormal system in $L^2$ space. In 2D or multidimensional applications of the DWT, separable orthonormal wavelets consisting of tensor products of 1D orthonormal wavelets are frequently used, while nonseparable orthonormal wavelets are also available. Another generalization of wavelets is the Malvar basis, which is also a generalization of the local Fourier basis and gives a perfect reconstruction. A new direction is the second-generation wavelets, which are constructed by the lifting scheme and are free from the regular dyadic procedure, and are thus applicable to compact regions such as $S^2$ and a finite interval.


See also: Fractal Dimensions in Dynamics; Image
Processing: Mathematics; Intermittency in Turbulence;
Wavelets: Application to Turbulence; Wavelets:
Mathematical Theory.


Wavelets: Mathematical Theory


Introduction


The wavelet transform unfolds functions into time (or space) and scale, and possibly directions. The continuous wavelet transform was discovered by Alex Grossmann and Jean Morlet, who published the first paper on wavelets in 1984. This mathematical technique, based on group theory and square-integrable representations, allows us to decompose a signal, or a field, into both space and scale, and possibly directions. The orthogonal wavelet transform was discovered by Lemarié and Meyer (1986). Then, Daubechies (1988) found orthogonal bases made of compactly supported wavelets, and Mallat (1989) designed the fast wavelet transform (FWT) algorithm. Further developments were made in 1991 by Raffy Coifman, Yves Meyer, and Victor Wickerhauser, who introduced wavelet packets and applied them to data compression. The development of wavelets has been interdisciplinary, with contributions coming from very different fields such as engineering (sub-band coding, quadrature mirror filters, time-frequency analysis), theoretical physics (coherent states of affine groups in quantum mechanics), and mathematics (Calderón-Zygmund operators, characterization of function spaces, harmonic analysis). Many reference textbooks are available; some that we recommend are listed in the "Further reading" section. Meanwhile, a large spectrum of applications has grown and is still developing, ranging from signal analysis and image processing via numerical analysis and turbulence modeling to data compression.


Further Reading 


Benedetto JJ and Frazier W (eds.) (1994) Wavelets: Mathematics
and Applications. Boca Raton, FL: CRC Press.
Philadelphia.

Mallat S (1998) A Wavelet Tour of Signal Processing. San Diego: 
Academic Press. 

Strang G and Nguyen T (1997) Wavelets and Filter Banks.
Wellesley: Wellesley-Cambridge Press.


In this article, we will first define the continuous 
wavelet transform and then the orthogonal wavelet 
transform based on a multiresolution analysis. 
Properties of both transforms will be discussed 
and illustrated by examples. For a general intro- 
duction to wavelets, see Wavelets: Applications. 


Continuous Wavelet Transform 


Let us consider the Hilbert space of square-integrable functions $L^2(\mathbb{R}) = \{f : \|f\|_2 < \infty\}$, equipped with the scalar product $\langle f, g \rangle = \int_{\mathbb{R}} f(x)\, g^*(x)\, dx$ ($*$ denotes the complex conjugate in the case of complex-valued functions) and where the norm is defined by $\|f\|_2 = \langle f, f \rangle^{1/2}$.


Analyzing Wavelet


The starting point for the wavelet transform is to choose a real- or complex-valued function $\psi \in L^2(\mathbb{R})$, called the "mother wavelet," which fulfills the admissibility condition,

$$C_\psi = \int_0^\infty |\hat\psi(k)|^2\, \frac{dk}{|k|} < \infty \qquad [1]$$

where

$$\hat\psi(k) = \int_{-\infty}^{\infty} \psi(x)\, e^{-i 2\pi k x}\, dx \qquad [2]$$

denotes the Fourier transform, with $i = \sqrt{-1}$ and $k$ the wave number. If $\psi$ is integrable, that is, $\psi \in L^1(\mathbb{R})$, this implies that $\psi$ has zero mean,

$$\int_{-\infty}^{\infty} \psi(x)\, dx = 0 \quad\text{or}\quad \hat\psi(k = 0) = 0 \qquad [3]$$

In practice, however, one also requires the wavelet $\psi$ to be well localized in both physical and Fourier spaces, the latter implying smoothness, and to have $M$ vanishing moments,

$$\int_{-\infty}^{\infty} x^m\, \psi(x)\, dx = 0 \quad\text{for } m = 0, \ldots, M - 1 \qquad [4]$$

that is, monomials up to degree $M - 1$ are exactly reproduced. In Fourier space, this property is equivalent to

$$\left.\frac{d^m}{dk^m}\, \hat\psi(k)\right|_{k=0} = 0 \quad\text{for } m = 0, \ldots, M - 1 \qquad [5]$$

therefore, the Fourier transform of $\psi$ decays smoothly at $k = 0$.


Analysis 


From the mother wavelet $\psi$, we generate a family of continuously translated and dilated wavelets,

$$\psi_{a,b}(x) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{x - b}{a}\right) \quad\text{for } a > 0 \text{ and } b \in \mathbb{R} \qquad [6]$$

where $a$ denotes the dilation parameter, corresponding to the width of the wavelet support, and $b$ the translation parameter, corresponding to the position of the wavelet. The wavelets are normalized in energy norm, that is, $\|\psi_{a,b}\|_2 = 1$.

In Fourier space, eqn [6] reads

$$\hat\psi_{a,b}(k) = \sqrt{a}\, \hat\psi(ak)\, e^{-i 2\pi k b} \qquad [7]$$

where the contraction with $1/a$ in [6] is reflected in a dilation by $a$ in [7], and the translation by $b$ implies a rotation in the complex plane.

The continuous wavelet transform of a function $f$ is then defined as the convolution of $f$ with the wavelet family $\psi_{a,b}$:

$$\tilde f(a, b) = \int_{-\infty}^{\infty} f(x)\, \psi^*_{a,b}(x)\, dx \qquad [8]$$

where $\psi^*_{a,b}$ denotes, in the case of complex-valued wavelets, the complex conjugate. Using Parseval's identity, we get

$$\tilde f(a, b) = \int_{-\infty}^{\infty} \hat f(k)\, \hat\psi^*_{a,b}(k)\, dk \qquad [9]$$

and the wavelet transform can be interpreted as a frequency decomposition using bandpass filters $\hat\psi_{a,b}$ centered at frequencies $k = k_\psi/a$. The wave number $k_\psi$ denotes the barycenter of the wavelet support in Fourier space

$$k_\psi = \frac{\int_0^\infty k\, |\hat\psi(k)|\, dk}{\int_0^\infty |\hat\psi(k)|\, dk} \qquad [10]$$

Note that these filters have a constant relative width $\Delta k/k$; therefore, when the wave number increases, the bandwidth $\Delta k$ becomes wider.


Synthesis 


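The admissibility condition [1] implies the existence of a finite energy reproducing kernel, which is a necessary condition for being able to reconstruct the function $f$ from its wavelet coefficients $\tilde f$. One then recovers

$$f(x) = \frac{1}{C_\psi} \int_0^\infty \int_{-\infty}^{\infty} \tilde f(a, b)\, \psi_{a,b}(x)\, \frac{da\, db}{a^2} \qquad [11]$$

which is the inverse wavelet transform.

The wavelet transform is an isometry and one has Parseval's identity. Therefore, the wavelet transform conserves the inner product and we obtain

$$\langle f, g \rangle = \int_{-\infty}^{\infty} f(x)\, g^*(x)\, dx = \frac{1}{C_\psi} \int_0^\infty \int_{-\infty}^{\infty} \tilde f(a, b)\, \tilde g^*(a, b)\, \frac{da\, db}{a^2} \qquad [12]$$

As a consequence, the total energy $E$ of a signal can be calculated either in physical space or in wavelet space, such as

$$E = \int_{-\infty}^{\infty} |f(x)|^2\, dx = \frac{1}{C_\psi} \int_0^\infty \int_{-\infty}^{\infty} |\tilde f(a, b)|^2\, \frac{da\, db}{a^2} \qquad [13]$$

This formula is also the starting point for the definition of wavelet spectra and scalogram (see Wavelets: Application to Turbulence).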


Examples 


In the following, we apply the continuous wavelet transform to different academic signals using the Morlet wavelet. The Morlet wavelet is complex valued, and consists of a modulated Gaussian with width $k_0/\pi$:

$$\psi(x) = \left(e^{2i\pi x} - e^{-k_0^2/2}\right) e^{-2\pi^2 x^2/k_0^2} \qquad [14]$$

The envelope factor $k_0$ controls the number of oscillations in the wave packet; typically, $k_0 = 5$ is used. The correction factor $e^{-k_0^2/2}$, to ensure its vanishing mean, is very small and often neglected. The Fourier transform is

$$\hat\psi(k) = \frac{k_0}{\sqrt{2\pi}}\, e^{-(k_0^2/2)(1 + k^2)} \left(e^{k_0^2 k} - 1\right) \qquad [15]$$

Figure 1 shows wavelet analyses of a cosine, two sines, a Dirac, and a characteristic function. Below the four signals we plot the modulus and the phase of the corresponding wavelet coefficients.
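The cosine case can be reproduced with a direct quadrature of the analysis formula [8]. The following is a minimal sketch rather than the procedure used for Figure 1: it assumes the Morlet wavelet in the form $\psi(x) = (e^{2i\pi x} - e^{-k_0^2/2})\, e^{-2\pi^2 x^2/k_0^2}$ with $k_0 = 5$, a cosine of wave number 2 as test signal, and a simple Riemann sum; all function names and discretization parameters are illustrative.

```python
import cmath
import math

K0 = 5.0

def morlet(x):
    """Assumed form of the Morlet wavelet, eqn [14], with k0 = 5."""
    envelope = math.exp(-2.0 * math.pi ** 2 * x ** 2 / K0 ** 2)
    return (cmath.exp(2j * math.pi * x) - math.exp(-K0 ** 2 / 2.0)) * envelope

def cwt(f, a, b, samples_per_scale=40, half_width=5.0):
    """Riemann-sum approximation of [8]: integral of f(x) psi*_{a,b}(x) dx."""
    dx = a / samples_per_scale
    n = int(half_width * samples_per_scale)
    total = 0j
    for i in range(-n, n + 1):
        x = b + i * dx
        total += f(x) * morlet((x - b) / a).conjugate()
    return total * dx / math.sqrt(a)

def cosine(x):
    return math.cos(2.0 * math.pi * 2.0 * x)   # wave number k = 2

scales = [0.2 + 0.01 * i for i in range(81)]   # a from 0.2 to 1.0
moduli = [abs(cwt(cosine, a, 0.0)) for a in scales]
best_scale = scales[moduli.index(max(moduli))]
print(best_scale)
```

The modulus peaks near $a \approx 1/k = 0.5$, in line with the interpretation of the transform as a bank of bandpass filters centered at $k = k_\psi/a$ (here $k_\psi \approx 1$); the small offset from 0.5 comes from the $\sqrt{a}$ normalization in [6].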


Higher Dimensions 


The continuous wavelet transform can be extended to higher dimensions in $L^2(\mathbb{R}^n)$ in different ways. Either we define spherically symmetric wavelets by setting $\psi(x) = \psi_{1d}(|x|)$ for $x \in \mathbb{R}^n$, or we introduce, in addition to dilations $a \in \mathbb{R}^+$ and translations $b \in \mathbb{R}^n$, also rotations to define wavelets with a directional sensitivity. In the two-dimensional case, we obtain, for example,

$$\psi_{a,b,\theta}(x) = \frac{1}{a}\, \psi\!\left(R_\theta^{-1}\, \frac{x - b}{a}\right) \qquad [16]$$

where $a \in \mathbb{R}^+$, $b \in \mathbb{R}^2$, and where $R_\theta$ is the rotation matrix

$$R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \qquad [17]$$

The analysis formula [8] then becomes

$$\tilde f(a, b, \theta) = \int_{\mathbb{R}^2} f(x)\, \psi^*_{a,b,\theta}(x)\, dx \qquad [18]$$

and for the corresponding inverse wavelet transform [11] we obtain

$$f(x) = \frac{1}{C_\psi} \int_0^\infty \int_{\mathbb{R}^2} \int_0^{2\pi} \tilde f(a, b, \theta)\, \psi_{a,b,\theta}(x)\, \frac{da\, db\, d\theta}{a^3} \qquad [19]$$

Similar constructions can be made in dimensions larger than 2 using $n - 1$ angles of rotation.

Figure 1 Examples of a one-dimensional continuous wavelet analysis using the complex-valued Morlet wavelet. Each subfigure shows on the top the function to be analyzed and below (left) the modulus of its wavelet coefficients and below (right) the phase of its wavelet coefficients.


Discrete Wavelets


Frames


It is possible to obtain a discrete set of quasiorthogonal wavelets by sampling the scale and position axes $a, b$. For the scale $a$ we use a logarithmic discretization: $a$ is replaced by $a_j = a_0^{-j}$, where $a_0$ is the sampling rate of the $\log a$ axis ($a_0 = \Delta(\log a)$) and where $j \in \mathbb{Z}$ is the scale index. The position $b$ is discretized linearly: $b$ is replaced by $x_{ji} = i b_0 a_0^{-j}$, where $b_0$ is the sampling rate of the position axis at the largest scale and where $i \in \mathbb{Z}$ is the position index. Note that the sampling rate of the position varies with scale, that is, for finer scales (increasing $j$ and hence decreasing $a_j$), the sampling rate increases. Accordingly, we obtain the discrete wavelets (cf. Figure 2)

$$\psi_{ji}(x) = a_j^{-1/2}\, \psi\!\left(\frac{x - x_{ji}}{a_j}\right) \qquad [20]$$

and the corresponding discrete decomposition formula is

$$\tilde f_{ji} = \langle f, \psi_{ji} \rangle = \int_{-\infty}^{\infty} f(x)\, \psi^*_{ji}(x)\, dx \qquad [21]$$

Furthermore, the wavelet coefficients satisfy the following estimate:

$$A \|f\|_2^2 \le \sum_{j=-\infty}^{\infty} \sum_{i=-\infty}^{\infty} |\tilde f_{ji}|^2 \le B \|f\|_2^2 \qquad [22]$$

with frame bounds $B \ge A > 0$. In the case $A = B$ we have a tight frame.


Figure 2 Orthogonal quintic spline wavelets $\psi_{j,i}(x) = 2^{j/2}\psi(2^j x - i)$ at different scales and positions: (a) $\psi_{5,6}(x)$, $\psi_{6,32}(x)$, $\psi_{7,108}(x)$, and (b) corresponding wavelet coefficients.

The discrete reconstruction formula is

$$f(x) = C \sum_{j=-\infty}^{\infty} \sum_{i=-\infty}^{\infty} \tilde f_{ji}\, \psi_{ji}(x) + R(x) \qquad [23]$$

where $C$ is a constant and $R(x)$ is a residual, both depending on the choice of the wavelet and the sampling of the scale and position axes. For the particular choice $a_0 = 2$ (which corresponds to a scale sampling by octaves) and $b_0 = 1$, we have the dyadic sampling, for which there exist special wavelets $\psi_{ji}$ that form an orthonormal basis of $L^2(\mathbb{R})$, that is, such that

$$\langle \psi_{ji}, \psi_{j'i'} \rangle = \delta_{jj'}\, \delta_{ii'} \qquad [24]$$

where $\delta$ denotes the Kronecker symbol. This means that the wavelets $\psi_{ji}$ are orthogonal with respect to their translates by discrete steps $2^{-j} i$ and their dilates by discrete steps $2^{-j}$ corresponding to octaves. In this case, the reconstruction formula is exact with $C = 1$ and $R = 0$. Note that the discrete wavelet transform has lost the invariance by translation and dilation of the continuous one.


Orthogonal Wavelets and Multiresolution Analysis 


The construction of orthogonal wavelet bases and the associated fast numerical algorithm is based on the mathematical concept of multiresolution analysis (MRA). The underlying idea is to consider approximations $f_j$ of the function $f$ at different scales $j$. The amount of information needed to go from a coarse approximation $f_j$ to a finer resolution approximation $f_{j+1}$ is then described using orthogonal wavelets. The orthogonal wavelet analysis can thus be interpreted as decomposing the function into approximations of the function at coarser and coarser scales (i.e., for decreasing $j$), where the differences between the approximations are encoded using wavelets.

The definition of the MRA was introduced by Stéphane Mallat in 1988 (Mallat 1989). This technique constitutes a mathematical framework of orthogonal wavelets and the related FWT.

A one-dimensional orthogonal MRA of $L^2(\mathbb{R})$ is defined as a sequence of successive approximation spaces $V_j$, $j \in \mathbb{Z}$, which are closed imbedded subspaces of $L^2(\mathbb{R})$. They verify the following conditions:

$$V_j \subset V_{j+1} \quad \forall j \in \mathbb{Z} \qquad [25]$$

$$\overline{\bigcup_{j \in \mathbb{Z}} V_j} = L^2(\mathbb{R}) \qquad [26]$$

$$\bigcap_{j \in \mathbb{Z}} V_j = \{0\} \qquad [27]$$

$$f(x) \in V_j \Leftrightarrow f(2x) \in V_{j+1} \qquad [28]$$

A scaling function $\phi(x)$ is required to exist. Its translates generate a basis in each $V_j$, that is,

$$V_j = \overline{\mathrm{span}\{\phi_{ji}\}_{i \in \mathbb{Z}}} \qquad [29]$$

where

$$\phi_{ji}(x) = 2^{j/2}\, \phi(2^j x - i), \quad j, i \in \mathbb{Z} \qquad [30]$$

At a given scale $j$, this basis is orthonormal with respect to its translates by steps $i/2^j$ but not to its dilates,

$$\langle \phi_{ji}, \phi_{jk} \rangle = \delta_{ik} \qquad [31]$$

The nestedness of the approximation spaces ([25], [28]) generated by the scaling function $\phi$ implies that it satisfies a refinement equation:

$$\phi_{j-1,i}(x) = \sum_{n=-\infty}^{\infty} h_{n-2i}\, \phi_{jn}(x) \qquad [32]$$

with the filter coefficients $h_n = \langle \phi_{jn}, \phi_{j-1,0} \rangle$, which determine the scaling function completely. In general, only the filter coefficients $h_n$ are known and no analytical expression of $\phi$ is given. Equation [32] implies that the approximation of a function at a coarser scale can be described by linear combinations of the same function at finer scales.

The orthogonal projection of a function $f \in L^2(\mathbb{R})$ on $V_j$ is defined as

$$P_{V_j} : f \mapsto P_{V_j} f = f_j \qquad [33]$$

with

$$f_j(x) = \sum_{k \in \mathbb{Z}} \langle f, \phi_{jk} \rangle\, \phi_{jk}(x) \qquad [34]$$

This coarse graining at a given scale $j$ is done by filtering the function with the scaling function $\phi$. As a filter, the scaling function $\phi$ does not have vanishing mean but is normalized so that $\int_{-\infty}^{\infty} \phi(x)\, dx = 1$.

As $V_{j-1}$ is included in $V_j$, we can define its orthogonal complement space $W_{j-1}$ in $V_j$:

$$V_j = V_{j-1} \oplus W_{j-1} \qquad [35]$$

Correspondingly, the approximation of the function $f$ at scale $2^{-j}$, belonging to $V_j$, can be decomposed as a sum of orthogonal projections on $V_{j-1}$ and $W_{j-1}$, such that

$$P_{V_j} f = P_{V_{j-1}} f + P_{W_{j-1}} f \qquad [36]$$

Based on the scaling function $\phi$, one can construct a function $\psi$, the so-called mother wavelet, given by the relation

$$\psi_{j-1,i}(x) = \sum_{n \in \mathbb{Z}} g_{n-2i}\, \phi_{jn}(x) \qquad [37]$$

with $g_n = \langle \phi_{jn}, \psi_{j-1,0} \rangle$, and where $\psi_{ji}(x) = 2^{j/2}\, \psi(2^j x - i)$, $j, i \in \mathbb{Z}$ (cf. Figure 2). The filter coefficients $g_n$ can be computed from the filter coefficients $h_n$ using the relation

$$g_n = (-1)^{1-n}\, h_{1-n} \qquad [38]$$

The translates and dilates of the wavelet $\psi$ constitute orthonormal bases of the spaces $W_j$,

$$W_j = \overline{\mathrm{span}\{\psi_{ji}\}_{i \in \mathbb{Z}}} \qquad [39]$$

As in the continuous case, the wavelets have vanishing mean, and also possibly vanishing higher-order moments; therefore,

$$\int_{-\infty}^{\infty} x^m\, \psi_{ji}(x)\, dx = 0 \quad\text{for } m = 0, \ldots, M - 1 \qquad [40]$$

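Let us now consider approximations of a function $f \in L^2(\mathbb{R})$ at two different scales $j$:

• at scale $j$

$$f_j(x) = \sum_{i=-\infty}^{\infty} \bar f_{ji}\, \phi_{ji}(x) \qquad [41]$$

• at scale $j - 1$

$$f_{j-1}(x) = \sum_{i=-\infty}^{\infty} \bar f_{j-1,i}\, \phi_{j-1,i}(x) \qquad [42]$$

with the scaling coefficients

$$\bar f_{ji} = \langle f, \phi_{ji} \rangle \qquad [43]$$

which correspond to local averages of the function $f$ at position $i 2^{-j}$ and at scale $2^{-j}$.

The difference between the two approximations is encoded by the wavelets

$$f_j(x) - f_{j-1}(x) = \sum_{i=-\infty}^{\infty} \tilde f_{j-1,i}\, \psi_{j-1,i}(x) \qquad [44]$$

with the wavelet coefficients

$$\tilde f_{ji} = \langle f, \psi_{ji} \rangle \qquad [45]$$

which correspond to local differences of the function at position $(2i + 1)2^{-(j+1)}$ between approximations at scales $2^{-j}$ and $2^{-(j+1)}$.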

Iterating the two-scale decomposition [44], any function $f \in L^2(\mathbb{R})$ can be expressed as a sum of a coarse-scale approximation at a reference scale $j_0$, which we set to 0 here, and the successive differences. These details are needed to go from one scale $j$ to the next finer scale $j + 1$ for $j = 0, \ldots, J - 1$,

$$f(x) = \sum_{i=-\infty}^{\infty} \bar f_{0,i}\, \phi_{0,i}(x) + \sum_{j=0}^{\infty} \sum_{i=-\infty}^{\infty} \tilde f_{ji}\, \psi_{ji}(x) \qquad [46]$$

For numerical applications, the sums in eqn [46] have to be truncated in both scale $j$ and position $i$. The truncation in scale corresponds to a limitation of $f$ to a given finest scale $J$, which is in practice imposed by the available sampling rate. Due to the finite length of the available data, the sum over $i$ also becomes finite. The decomposition [46] is orthogonal, as, by construction,

$$\langle \psi_{ji}, \psi_{lk} \rangle = \delta_{jl}\, \delta_{ik} \qquad [47]$$

$$\langle \psi_{ji}, \phi_{lk} \rangle = 0 \quad\text{for } j \ge l \qquad [48]$$

in addition to [31].


Fast Wavelet Transform 


Starting with a function $f \in L^2(\mathbb{R})$ given at the finest resolution $2^{-J}$ (i.e., we know $f_J \in V_J$ and hence the coefficients $\bar f_{J,i}$ for $i \in \mathbb{Z}$), the FWT computes its wavelet coefficients $\tilde f_{ji}$ by decomposing successively each approximation $f_j$ into a coarser scale approximation $f_{j-1}$, plus the corresponding details which are encoded by the wavelet coefficients. The algorithm uses a cascade of discrete convolutions with the low-pass filter $h_n$ and the bandpass filter $g_n$, followed by downsampling, in which only one coefficient out of two is retained. The direct wavelet transform algorithm is

• initialization

given $f \in L^2(\mathbb{R})$ and $\bar f_{J,i} = f(i/2^J)$ for $i \in \mathbb{Z}$

• decomposition

for $j = J$ to 1, step $-1$, do

$$\bar f_{j-1,i} = \sum_{n \in \mathbb{Z}} h_{n-2i}\, \bar f_{j,n} \qquad [49]$$

$$\tilde f_{j-1,i} = \sum_{n \in \mathbb{Z}} g_{n-2i}\, \bar f_{j,n} \qquad [50]$$

The inverse wavelet transform is based on successive reconstructions of fine-scale approximations $f_j$ from coarser scale approximations $f_{j-1}$, plus the differences between approximations at scale $j - 1$ and the finer scale $j$, which are encoded by $\tilde f_{j-1,i}$. The algorithm uses a cascade of discrete convolutions with the filters $h_n$ and $g_n$, preceded by upsampling, which adds zeros in between two successive coefficients.

• reconstruction

for $j = 1$ to $J$, step 1, do

$$\bar f_{j,i} = \sum_{n=-\infty}^{\infty} h_{i-2n}\, \bar f_{j-1,n} + \sum_{n=-\infty}^{\infty} g_{i-2n}\, \tilde f_{j-1,n} \qquad [51]$$


The FWT was introduced by Stéphane Mallat in 1989. If the scaling functions (and wavelets) are compactly supported, the filters $h_n$ and $g_n$ have only a finite number of nonvanishing coefficients. In this case, the numerical complexity of the FWT is $O(N)$, where $N$ denotes the number of samples.
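The decomposition [49], [50] and the reconstruction [51] can be sketched in a few lines. The version below is an illustration under assumptions that are not part of the text: the data are treated as periodic (indices are taken modulo the signal length), the Daubechies D2 filter is written in its closed form $h = (1 + \sqrt 3,\ 3 + \sqrt 3,\ 3 - \sqrt 3,\ 1 - \sqrt 3)/(4\sqrt 2)$, the samples are used directly as finest-scale coefficients, and the function names are hypothetical.

```python
import math

S3 = math.sqrt(3.0)
# Daubechies D2 low-pass filter h_n, n = 0..3 (closed form).
H = {n: c / (4.0 * math.sqrt(2.0))
     for n, c in enumerate([1 + S3, 3 + S3, 3 - S3, 1 - S3])}
# Band-pass filter from relation [38]: g_n = (-1)^(1-n) h_(1-n).
G = {1 - n: (-1.0 if (1 - n) % 2 else 1.0) * c for n, c in H.items()}

def decompose_step(fine):
    """Eqns [49], [50] with periodic indexing: filter, then keep every second value."""
    N = len(fine)
    coarse = [sum(H[k] * fine[(k + 2 * i) % N] for k in H) for i in range(N // 2)]
    detail = [sum(G[k] * fine[(k + 2 * i) % N] for k in G) for i in range(N // 2)]
    return coarse, detail

def reconstruct_step(coarse, detail):
    """Eqn [51] with periodic indexing: upsample, filter, and add both branches."""
    N = 2 * len(coarse)
    fine = [0.0] * N
    for n in range(len(coarse)):
        for k, w in H.items():
            fine[(k + 2 * n) % N] += w * coarse[n]
        for k, w in G.items():
            fine[(k + 2 * n) % N] += w * detail[n]
    return fine

def fwt(samples, levels):
    """Cascade of decomposition steps; returns coarsest approximation and details."""
    approx, details = list(samples), []
    for _ in range(levels):
        approx, d = decompose_step(approx)
        details.append(d)
    return approx, details

def ifwt(approx, details):
    """Inverse cascade, from the coarsest scale back to the finest."""
    for d in reversed(details):
        approx = reconstruct_step(approx, d)
    return approx

samples = [math.sin(2.0 * math.pi * i / 16.0) + 0.1 * i for i in range(16)]
approx, details = fwt(samples, levels=3)
restored = ifwt(approx, details)
max_error = max(abs(u - v) for u, v in zip(samples, restored))
energy_in = sum(v * v for v in samples)
energy_out = sum(v * v for v in approx) + sum(v * v for d in details for v in d)
print(max_error, energy_in, energy_out)
```

Each level touches every remaining coefficient a fixed number of times and halves the length, so the total work is proportional to $N$; the reconstruction error is at the level of rounding, and the energy is conserved because the periodized transform is orthogonal.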


Choice of Wavelets 


Orthogonal wavelets are typically defined by their filter coefficients $h_n$, since in general no analytic expression for $\psi$ is available. In the following, we give the filter coefficients $h_n$ for some typical orthogonal wavelets. The filter coefficients $g_n$ can be obtained using the quadrature relation between the two filters [38].


• Haar D1 (one vanishing moment):

$h_0 = 1/\sqrt{2}$
$h_1 = 1/\sqrt{2}$

• Daubechies D2 (two vanishing moments):

$h_0$ = 0.482 962 913 145
$h_1$ = 0.836 516 303 738
$h_2$ = 0.224 143 868 042
$h_3$ = −0.129 409 522 551

• Daubechies D3 (three vanishing moments):

$h_0$ = 0.332 670 552 950
$h_1$ = 0.806 891 509 311
$h_2$ = 0.459 877 502 118
$h_3$ = −0.135 011 020 010
$h_4$ = −0.085 441 273 882
$h_5$ = 0.035 226 291 886

• Coiflets C12 (four vanishing moments): the wavelets and the corresponding scaling function are shown in Figure 3.
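The defining properties of such filters are easy to verify numerically. The sketch below is illustrative (the function names are hypothetical, D2 is entered in closed form, and D3 is entered to twelve decimals, so checks are made with a tolerance): it tests the normalization $\sum_n h_n = \sqrt 2$, the orthonormality $\sum_n h_n h_{n+2k} = \delta_{k0}$, and counts the vanishing discrete moments $\sum_n n^m g_n = 0$ of the filter $g_n$ obtained from relation [38].

```python
import math

S2, S3 = math.sqrt(2.0), math.sqrt(3.0)
FILTERS = {
    "Haar D1": [1 / S2, 1 / S2],
    "Daubechies D2": [c / (4 * S2) for c in (1 + S3, 3 + S3, 3 - S3, 1 - S3)],
    "Daubechies D3": [0.332670552950, 0.806891509311, 0.459877502118,
                      -0.135011020010, -0.085441273882, 0.035226291886],
}

def wavelet_filter(h):
    """Relation [38]: g_n = (-1)^(1-n) h_(1-n), returned as {n: g_n}."""
    return {1 - n: (-1.0 if (1 - n) % 2 else 1.0) * c for n, c in enumerate(h)}

def is_orthonormal(h, tol=1e-9):
    """Check sum_n h_n = sqrt(2) and sum_n h_n h_(n+2k) = delta_(k,0)."""
    if abs(sum(h) - S2) > tol:
        return False
    for k in range(len(h) // 2):
        dot = sum(h[n] * h[n + 2 * k] for n in range(len(h) - 2 * k))
        if abs(dot - (1.0 if k == 0 else 0.0)) > tol:
            return False
    return True

def vanishing_moments(h, tol=1e-9):
    """Number M of leading moments sum_n n^m g_n that vanish (m = 0, ..., M-1)."""
    g = wavelet_filter(h)
    m = 0
    while m < len(h) and abs(sum(n ** m * c for n, c in g.items())) < tol:
        m += 1
    return m

report = {name: (is_orthonormal(h), vanishing_moments(h)) for name, h in FILTERS.items()}
print(report)
```

With these inputs the check reports one, two, and three vanishing moments for D1, D2, and D3, respectively, matching the labels of the list above.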


Remarks The construction of orthogonal wavelets in $L^2(\mathbb{R})$ can be modified to obtain wavelets on the interval, that is, in $L^2([0, 1])$. To this end, boundary wavelets are introduced, while in the interior of the interval the wavelets are not modified.

Figure 3 Orthogonal wavelets Coiflet C12. (a) Scaling function $\phi(x)$ (left) and $|\hat\phi(\omega)|$. (b) Wavelet $\psi(x)$ (left) and $|\hat\psi(\omega)|$.


A periodic MRA of $L^2(\mathbb{T})$, where $\mathbb{T} = \mathbb{R}/\mathbb{Z}$
denotes the torus, can also be constructed by
periodizing the wavelets in $L^2(\mathbb{R})$, using

$$\psi^{\mathrm{per}}(x) = \sum_{k \in \mathbb{Z}} \psi(x + k)$$


Relaxing the condition of orthogonality allows
greater flexibility in the choice of the basis
functions. For example, biorthogonal wavelets can
be designed using different basis functions for
analysis (a) and synthesis (s), which are related
but no longer orthogonal. A pair of refinable
scaling functions $(\phi^a, \phi^s)$ with related wavelets
$(\psi^a, \psi^s)$, which are by construction biorthogonal,
generates a biorthogonal MRA $V_j^a, V_j^s$. From an
algorithmic point of view, two different filter
pairs are used, $(g^a, h^a)$ for the forward and $(g^s, h^s)$ for the
backward FWT, without changing the
algorithm.

The multiresolution approach can be further
generalized to samplings on nonequidistant
grids, leading to the so-called second-generation
wavelets.


Higher Dimensions 


The previously presented one-dimensional construc- 
tion can be extended to higher dimensions. For 
simplicity, we will consider only the two- 
dimensional case, since higher dimensions can be 
treated analogously. 


Tensor product construction Having developed a
one-dimensional orthonormal basis $\psi_{j,i}$ of $L^2(\mathbb{R})$, one
could use these functions as building blocks in
higher dimensions. One way of doing so is to take
the tensor product of two one-dimensional bases
and to define

$$\psi_{j_x,j_y,i_x,i_y}(x,y) = \psi_{j_x,i_x}(x)\,\psi_{j_y,i_y}(y)$$ [52]

The resulting functions constitute an orthonormal
wavelet basis for $L^2(\mathbb{R}^2)$. Each function $f \in L^2(\mathbb{R}^2)$
can then be developed into

$$f(x,y) = \sum_{j_x,i_x} \sum_{j_y,i_y} \tilde f_{j_x,j_y,i_x,i_y}\,\psi_{j_x,j_y,i_x,i_y}(x,y)$$ [53]

with $\tilde f_{j_x,j_y,i_x,i_y} = \langle f, \psi_{j_x,j_y,i_x,i_y} \rangle$. However, in this basis
the two variables x and y are dilated separately
Figure 4 Schematic representation of the 2D wavelet transforms: (a) tensor product construction and (b) 2D MRA.


and therefore no longer form an MRA. This means
that the functions $\psi_{j_x,j_y,i_x,i_y}$ involve two scales, $2^{j_x}$ and
$2^{j_y}$, and each of the functions is essentially supported
on a rectangle with these side-lengths. Hence, the
decomposition is often called rectangular wavelet
decomposition (cf. Figure 4a). From the algorithmic
viewpoint, this is equivalent to applying the one-dimensional
wavelet transform to the rows and the
columns of a matrix or a function. For some
applications such a basis is advantageous, for others
it is not. Often the notion of scale has a specific
meaning, and for such applications one would like to have
a unique scale assigned to each basis function.


Multiresolution construction Another much more
interesting construction is that of a truly
two-dimensional MRA of $L^2(\mathbb{R}^2)$. It can be obtained
through the tensor product of two one-dimensional
MRAs of $L^2(\mathbb{R})$. More precisely, one defines the
spaces $\mathbf{V}_j$, $j \in \mathbb{Z}$, by

$$\mathbf{V}_j = V_j \otimes V_j$$ [54]

and $\mathbf{V}_j = \mathrm{span}\{\phi_{j,i_x,i_y}(x,y) = \phi_{j,i_x}(x)\,\phi_{j,i_y}(y),\ i_x, i_y \in \mathbb{Z}\}$,
fulfilling analogous properties as in the one-dimensional
case.

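Likewise, we define the complement space $\mathbf{W}_j$ to
be the orthogonal complement of $\mathbf{V}_j$ in $\mathbf{V}_{j+1}$, that is,

$$\mathbf{V}_{j+1} = V_{j+1} \otimes V_{j+1} = (V_j \oplus W_j) \otimes (V_j \oplus W_j)$$ [55]

$$= V_j \otimes V_j \oplus \big( (W_j \otimes V_j) \oplus (V_j \otimes W_j) \oplus (W_j \otimes W_j) \big)$$ [56]

$$= \mathbf{V}_j \oplus \mathbf{W}_j$$ [57]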


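It follows that the orthogonal complement $\mathbf{W}_j = \mathbf{V}_{j+1} \ominus \mathbf{V}_j$
consists of three different types of functions
and is generated by three different wavelets

$$\psi^{\varepsilon}_{j,i_x,i_y}(x,y) = \begin{cases} \psi_{j,i_x}(x)\,\phi_{j,i_y}(y), & \varepsilon = 1 \\ \phi_{j,i_x}(x)\,\psi_{j,i_y}(y), & \varepsilon = 2 \\ \psi_{j,i_x}(x)\,\psi_{j,i_y}(y), & \varepsilon = 3 \end{cases}$$ [58]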


Observe that here the scale parameter j simultaneously
controls the dilation in x and y. We recall
that in d dimensions this construction yields $2^d - 1$
types of wavelets spanning $\mathbf{W}_j$.

Using [58], each function $f \in L^2(\mathbb{R}^2)$ can be
developed into a multiresolution basis as

$$f(x,y) = \sum_{j} \sum_{i_x,i_y} \sum_{\varepsilon=1,2,3} \tilde f^{\varepsilon}_{j,i_x,i_y}\,\psi^{\varepsilon}_{j,i_x,i_y}(x,y)$$ [59]

with $\tilde f^{\varepsilon}_{j,i_x,i_y} = \langle f, \psi^{\varepsilon}_{j,i_x,i_y} \rangle$. A schematic representation
of the wavelet coefficients is shown in
Figure 4b. The algorithmic structure of the one-dimensional
transforms carries over to the two-dimensional
case by simple tensorization, that is,
applying the filters at each decomposition step to
rows and columns.
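The algorithmic difference between the two constructions can be made concrete with a short sketch. It uses the Haar filter for brevity and square arrays whose side is a power of two; both choices, and the function names, are assumptions made for illustration. The rectangular transform applies the full one-dimensional transform to rows and then to columns, whereas the 2D MRA performs a single step on rows and columns and then recurses on the coarse block only.

```python
import math

S = 1 / math.sqrt(2)

def haar_step(v):
    """One Haar step: scaled sums in the first half, differences in the second."""
    half = len(v) // 2
    return ([S * (v[2 * i] + v[2 * i + 1]) for i in range(half)]
            + [S * (v[2 * i] - v[2 * i + 1]) for i in range(half)])

def haar_full(v):
    """Full 1D Haar transform: repeat the step on the coarse half."""
    out, n = list(v), len(v)
    while n > 1:
        out[:n] = haar_step(out[:n])
        n //= 2
    return out

def transpose(m):
    return [list(col) for col in zip(*m)]

def rectangular_2d(m):
    """Tensor product construction: full 1D transform of rows, then columns."""
    rows = [haar_full(r) for r in m]
    return transpose([haar_full(c) for c in transpose(rows)])

def mra_2d(m):
    """2D MRA: one step on rows and columns, then recurse on the coarse block."""
    out, n = [list(r) for r in m], len(m)
    while n > 1:
        block = [haar_step(r[:n]) for r in out[:n]]
        block = transpose([haar_step(c) for c in transpose(block)])
        for i in range(n):
            out[i][:n] = block[i]
        n //= 2
    return out

def energy(m):
    return sum(x * x for r in m for x in r)

sample = [[float((3 * i + 5 * j) % 7) for j in range(8)] for i in range(8)]
print(energy(sample), energy(rectangular_2d(sample)), energy(mra_2d(sample)))
```

Both transforms are orthonormal, so the three printed energies agree up to rounding; the two layouts differ only in how the coefficients are organized by scale.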


Remark The described two-dimensional wavelets
and scaling functions are separable. Their advantage is
the ease of generation starting from one-dimensional
MRAs. However, the main drawback
of this construction is that three wavelets are needed
to span the orthogonal complement space $\mathbf{W}_j$.
Another property should be mentioned. By construction,
the wavelets are anisotropic, that is, horizontal,
diagonal, and vertical directions are preferred.


Approximation Properties 
Reproduction of Polynomials 


A fundamental property of the MRA is the exact
reproduction of polynomials. The vanishing
moments of the wavelet $\psi$, that is, $\int_{\mathbb{R}} x^m \psi(x)\,dx = 0$
for $m = 0, \ldots, M-1$, are equivalent to the fact that
polynomials up to degree $M - 1$ can be expressed
exactly as a linear combination of scaling functions,
$p_m(x) = \sum_{n \in \mathbb{Z}} n^m \phi(x - n)$ for $m = 0, \ldots, M-1$. This so-called
Strang-Fix condition proves that $\psi$ has M
vanishing moments if and only if any polynomial of
degree $M - 1$ can be written as a linear combination
of scaling functions $\phi$. Note that, as $p_m \notin L^2(\mathbb{R})$, the
coefficients $n^m$ are not in $l^2(\mathbb{Z})$.


Regularity and Local Decay of Wavelet 
Coefficients 


The local or global regularity of a function is closely
related to the decay of its wavelet coefficients. If a
function is locally in $C^s(\mathbb{R})$ (the space of s-times
continuously differentiable functions), it can be well
approximated locally by a Taylor series of degree s.
Consequently, its wavelet coefficients are small at
fine scales, as long as the wavelet $\psi$ has enough
vanishing moments. The decay of the coefficients
hence determines directly the error being made when
truncating a wavelet sum at some scale.

Depending on the type of norm used and whether
global or local characterization is concerned, various
relations of this kind have been developed. Let us
take as an example the case of an $\alpha$-Lipschitz function.

Suppose $f \in L^2(\mathbb{R})$; then for $[a,b] \subset \mathbb{R}$ the function
f is $\alpha$-Lipschitz with $0 < \alpha < 1$ for any $x_0 \in [a,b]$,
that is, $|f(x_0 + h) - f(x_0)| \le C|h|^{\alpha}$, if and
only if there exists a constant A such that
$|\tilde f_{j,i}| \le A\,2^{-j(\alpha + 1/2)}$ for any (j,i) with $i/2^j \in [a,b]$.

This shows the relation between the local reg- 
ularity of a function and the decay of its wavelet 
coefficients in scale. 


Example To illustrate the local decay of the
wavelet coefficients, we consider in Figure 5 the
function $f(x) = \sin(2\pi x)$ for $x \le 1/4$ and $x > 3/4$
and $f(x) = -\sin(2\pi x)$ for $1/4 < x < 3/4$. The corresponding
wavelet coefficients for quintic spline
wavelets are plotted in logarithmic scale. The
wavelet coefficients show that the fine-scale coefficients
are significant only in a local region
around the singularities.


Linear Approximation 


The exact reproduction of polynomials can be used
to derive error estimates for the approximation of a
function f at a given scale, which corresponds to
linear approximation. We consider f belonging to
the Sobolev space $W^{s,p}(\mathbb{R}^d)$, that is, the weak
derivatives of f up to order s belong to $L^p(\mathbb{R}^d)$. The
linear approximation of f at scale J, corresponding
to the projection of f onto $V_J$, is then given by


Figure 5 Orthogonal wavelet decomposition using quintic
spline wavelets: (a) function $f(x) = \sin(2\pi x)$ for $x < 1/4$ and $x > 3/4$
and $f(x) = -\sin(2\pi x)$ for $1/4 \le x \le 3/4$, sampled on a grid
$x_i = i/2^J$, $i = 0, \ldots, 2^J - 1$ with $J = 9$, and (b) corresponding wavelet
coefficients $\log_{10} |\tilde f_{j,i}|$ for $i = 0, \ldots, 2^j - 1$ and $j = 0, \ldots, J - 1$.


$$f_J(x) = \sum_{j=0}^{J-1} \sum_{i \in \mathbb{Z}} \tilde f_{j,i}\,\psi_{j,i}(x)$$ [60]


The approximation error can be estimated by

$$\|f - f_J\|_{L^p} \le C\,2^{-J \min(s,m)}$$ [61]


where s denotes the smoothness of the function in
$L^p$, d the space dimension, and m the number of
vanishing moments of the wavelet $\psi$. In the case of
poor global regularity of f, that is, for small s, a
large number of scales J is needed to get a good
approximation of f.

In Figure 6, we plot the linear approximation of
the function f shown in Figure 5. The function $f_J$ is
reconstructed using wavelet coefficients up to scale
$J - 1 = 5$, so that in total only 64 out of 512
coefficients are retained. We observe an oscillating
behavior of $f_J$ near the discontinuities of f, which
dominates the approximation error.


Nonlinear Approximation 


Retaining the N largest wavelet coefficients in the
wavelet expansion of f in [46], without imposing
any a priori cutoff scale, yields the best N-term
approximation $f_N$. In contrast to the linear approximation
[60], it is called nonlinear approximation,
since the choice of the retained coefficients depends
on the function f. The mathematical theory has been
formalized by Cohen, Dahmen, and DeVore.

Figure 6 (a) Linear approximation $f_J$ of the function f in
Figure 5 for $J = 6$, reconstructed from 64 wavelet coefficients
using quintic spline wavelets and (b) corresponding wavelet
coefficients $\log_{10} |\tilde f_{j,i}|$ for $i = 0, \ldots, 2^j - 1$ and $j = 0, \ldots, J - 1$.
Note that the coefficients for $j > 5$ have been set to zero.

The nonlinear approximation of the function f can
then be written as

$$f_N(x) = \sum_{(j,i) \in \Lambda_N} \tilde f_{j,i}\,\psi_{j,i}(x)$$ [62]


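where $\Lambda_N$ denotes the ensemble of all multi-indices
$\lambda = (j,i)$ indexing the N largest coefficients (measured
in the $l^p$ norm),

$$\Lambda_N = \left\{ \lambda_k,\ k = 1, \ldots, N \ \middle|\ \|\tilde f_{\lambda_k}\|_{l^p} > \|\tilde f_{\mu}\|_{l^p} \ \ \forall \mu \in \Lambda,\ \mu \ne \lambda_k \right\}$$ [63]

with $\Lambda = \{\lambda = (j,i),\ j \ge 0,\ i \in \mathbb{Z}\}$. The nonlinear
approximation leads to the following error estimate: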


$$\|f - f_N\|_{L^p} \le C\,N^{-s/d}$$ [64]

where s denotes the smoothness of f in the larger
space $L^q(\mathbb{R}^d)$ with

$$\frac{1}{q} = \frac{1}{p} + \frac{s}{d}$$

which corresponds to the Sobolev embedding line
(Figure 7). This estimate shows that the nonlinear
approximation converges faster than the linear one
if f has a larger regularity in $L^q$, that is, $f \in W^{s,q}(\mathbb{R}^d)$,
which is for example the case for functions
with isolated singularities and for small q.
Figure 7 Schematic representation of linear and nonlinear
approximation (horizontal axis 1/p; embedding line 1/q = 1/p + s/d).


In Figure 8, we plot the nonlinear approximation
of the function f shown in Figure 5. The function $f_N$
is reconstructed using the 64 largest wavelet
coefficients out of 512 coefficients. Compared to
the linear approximation (cf. Figure 6), the oscillations
around the discontinuities disappear and the
approximation error is reduced while using the same
number of coefficients.


Compression and Preconditioning of Operators 


Figure 8 (a) Nonlinear approximation $f_N$ of the function f in
Figure 5 reconstructed from the 64 largest wavelet coefficients
using quintic spline wavelets, (b) retained wavelet coefficients
$\log_{10} |\tilde f_{j,i}|$ for $i = 0, \ldots, 2^j - 1$ and $j = 0, \ldots, J - 1$.

The nonlinear approximation of functions can be
extended to certain operators, leading to an efficient
representation in wavelet space, that is, to sparse
matrices. For integral operators, for example,
Calderón-Zygmund operators T on $\mathbb{R}$ defined by


$$Tf(x) = \int_{\mathbb{R}} K(x,y)\,f(y)\,dy$$ [65]

where the kernel K satisfies

$$|K(x,y)| \le \frac{C}{|x - y|}$$

and

$$\left| \frac{\partial}{\partial x} K(x,y) \right| + \left| \frac{\partial}{\partial y} K(x,y) \right| \le \frac{C}{|x - y|^2}$$

their wavelet representation $\langle T\psi_{j,i}, \psi_{j',i'} \rangle$ is sparse
and a large number of weak coefficients can be
suppressed by simple thresholding of the matrix
entries while controlling the precision. The resulting
numerical scheme is called the BCR algorithm and is
due to Beylkin et al. (1991).

The characterization of function spaces by the
decay of the wavelet coefficients and the corresponding
norm equivalences can be used for
diagonal preconditioning of integral or differential
operators, which leads to matrices with uniformly
bounded condition numbers. For elliptic differential
operators, for example, the Laplace operator $\nabla^2$, the
norm equivalence $\|\nabla^2 f\| \sim \|2^{2j} \tilde f_{j,i}\|$ can be used for
preconditioning the matrix $\langle \nabla^2 \psi_{j,i}, \psi_{j',i'} \rangle$ by a simple
diagonal scaling with $2^{-2j}$ to obtain a uniformly
bounded condition number. For further details, we
refer to the book of Cohen (2000).


Wavelet Denoising 


We consider a function f which is corrupted by a
Gaussian white noise $n \in N(0, \sigma^2)$. The noise is
spread over all wavelet coefficients $\tilde s_\lambda$, while,
typically, the original function f is determined by
only a few significant wavelet coefficients. The aim is
then to reconstruct the function f from the observed
noisy signal $s = f + n$.

The principle of wavelet denoising can be
summarized in the following procedure:

• Decomposition. Compute the wavelet coefficients
$\tilde s_\lambda$ using the FWT.

• Thresholding. Apply the thresholding function $\rho_\varepsilon$
to the wavelet coefficients $\tilde s_\lambda$, thus reducing the
relative importance of the coefficients with small
absolute value.

• Reconstruction. Reconstruct a denoised version $s_C$
from the thresholded wavelet coefficients using
the fast inverse wavelet transform.


The thresholding parameter $\varepsilon$ depends on the
variance of the noise and on the sample size N.
The thresholding function $\rho$ we consider corresponds
to hard thresholding:

$$\rho_\varepsilon(a) = \begin{cases} a & \text{if } |a| > \varepsilon \\ 0 & \text{if } |a| \le \varepsilon \end{cases}$$ [66]

Donoho and Johnstone (1994) have shown that
there exists an optimal $\varepsilon$ for which the relative
quadratic error between the signal s and its
estimator $s_C$ is close to the minimax error for all
signals $s \in H$, where H belongs to a wide class of
function spaces, including Hölder and Besov spaces.
They showed that using the threshold

$$\varepsilon_D = \sigma_n \sqrt{2 \ln N}$$ [67]

yields an error which is close to the minimum error.
The threshold $\varepsilon_D$ depends only on the number of samples N
and on the variance of the noise $\sigma_n^2$; hence, it is
called the universal threshold. However, in many
applications, $\sigma_n$ is unknown and has to be estimated
from the available noisy data s. For this, the present
authors have developed an iterative algorithm (see
Azzolini et al. (2005)), which is sketched in the
following:


1. Initialization
(a) given $s_k$, $k = 0, \ldots, N - 1$, set $i = 0$ and compute
the FWT of s to obtain $\tilde s_\lambda$;
(b) compute the variance $\sigma_0^2$ of s as a rough
estimate of the variance of n and compute the
corresponding threshold $\varepsilon_0 = (2 \ln N\, \sigma_0^2)^{1/2}$;
(c) set the number of coefficients considered as
noise $N_{\mathrm{noise}} = N$;

2. Main loop
repeat
(a) set $N_{\mathrm{noise}}^{\mathrm{old}} = N_{\mathrm{noise}}$ and count the wavelet
coefficients $N_{\mathrm{noise}}$ with modulus smaller
than $\varepsilon_i$;
(b) compute the new variance $\sigma_{i+1}^2$ from the
wavelet coefficients whose modulus is smaller
than $\varepsilon_i$ and the new threshold
$\varepsilon_{i+1} = (2 (\ln N)\, \sigma_{i+1}^2)^{1/2}$;
(c) set $i = i + 1$
until ($N_{\mathrm{noise}}^{\mathrm{old}} = N_{\mathrm{noise}}$)

3. Final step
(a) compute $s_C$ from the coefficients with modulus
larger than $\varepsilon_i$ using the inverse FWT.
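The three steps above can be sketched as follows. This is an illustrative reading of the procedure and not the code of Azzolini et al. (2005): it uses the Haar wavelet, estimates each variance as the mean square of the coefficients currently classified as noise (zero mean assumed), and adds an iteration cap as a safeguard. The function names, the test signal, and the random seed are all assumptions.

```python
import math
import random

S = 1 / math.sqrt(2)

def haar_forward(v):
    out, n = list(v), len(v)
    while n > 1:
        half = n // 2
        avg = [S * (out[2 * i] + out[2 * i + 1]) for i in range(half)]
        dif = [S * (out[2 * i] - out[2 * i + 1]) for i in range(half)]
        out[:n] = avg + dif
        n = half
    return out

def haar_inverse(c):
    out, n = list(c), 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [S * (a + d), S * (a - d)]
        out[:2 * n] = merged
        n *= 2
    return out

def iterative_threshold(coeffs, max_iter=100):
    """Steps 1 and 2: iterate threshold and noise variance until the count settles."""
    log_term = 2 * math.log(len(coeffs))
    var = sum(c * c for c in coeffs) / len(coeffs)  # rough initial estimate
    eps = math.sqrt(log_term * var)
    n_noise = len(coeffs)
    for _ in range(max_iter):
        n_old = n_noise
        noise = [c for c in coeffs if abs(c) < eps]
        n_noise = len(noise)
        if n_noise == 0:
            break
        var = sum(c * c for c in noise) / n_noise
        eps = math.sqrt(log_term * var)
        if n_noise == n_old:
            break
    return eps, var

def denoise(signal):
    """Step 3: hard-threshold the coefficients and invert the transform."""
    coeffs = haar_forward(signal)
    eps, var = iterative_threshold(coeffs)
    kept = [c if abs(c) > eps else 0.0 for c in coeffs]
    return haar_inverse(kept), eps, var

def mse(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

random.seed(0)
N = 4096
clean = [10 * math.sqrt(2) * math.sin(2 * math.pi * k / N) for k in range(N)]
noisy = [f + random.gauss(0.0, 1.0) for f in clean]
denoised, eps, var = denoise(noisy)
print("threshold:", eps, "estimated noise variance:", var)
print("MSE noisy:", mse(noisy, clean), "MSE denoised:", mse(denoised, clean))
```

The set of coefficients classified as noise can only shrink from one iteration to the next, so the loop stops after finitely many steps.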


Example To illustrate the properties of the denoising
algorithm, we apply it to a one-dimensional test signal.
We construct a noisy signal s by superposing a
Gaussian white noise, with zero mean and variance
$\sigma_n^2 = 1$, to a function f, normalized such that
$((1/N) \sum_k |f_k|^2)^{1/2} = 10$. The number of samples is
$N = 8192$. Figure 9a shows the function f together
with the noise n; Figure 9b shows the constructed
noisy signal s and Figure 9c shows the wavelet
denoised signal $s_C$ together with the extracted noise.

Figure 9 Construction (top) of a 1D noisy signal $s = f + n$ (middle), and results obtained by the recursive denoising algorithm (bottom).
Main Definition


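The WDVV equations of associativity (after E Witten,
R Dijkgraaf, E Verlinde, and H Verlinde) are
tantamount to the following problem: find a function
F(v) of n variables $v = (v^1, v^2, \ldots, v^n)$ satisfying
the conditions [1], [3], and [4] given below. First,

$$\frac{\partial^3 F(v)}{\partial v^1 \partial v^\alpha \partial v^\beta} = \eta_{\alpha\beta}$$ [1]

must be a constant symmetric nondegenerate matrix.
Denote by $(\eta^{\alpha\beta}) = (\eta_{\alpha\beta})^{-1}$ the inverse matrix and introduce
the functions

$$c^\gamma_{\alpha\beta}(v) = \eta^{\gamma\epsilon} \frac{\partial^3 F(v)}{\partial v^\epsilon \partial v^\alpha \partial v^\beta}, \quad \alpha, \beta, \gamma = 1, \ldots, n$$ [2]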
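The main condition says that, for arbitrary
$v^1, \ldots, v^n$, these functions must be structure constants
of an associative algebra, that is, introducing
a v-dependent multiplication law in the n-dimensional
space by

$$(a, b) \mapsto a \cdot b := \left( c^1_{\alpha\beta}(v)\, a^\alpha b^\beta, \ldots, c^n_{\alpha\beta}(v)\, a^\alpha b^\beta \right)$$

one obtains an n-parameter family of n-dimensional
associative algebras (these algebras will automatically
also be commutative). Spelling out this condition
one obtains an overdetermined system of
nonlinear PDEs for the function F(v), often also
called WDVV associativity equations

$$\frac{\partial^3 F(v)}{\partial v^\alpha \partial v^\beta \partial v^\lambda}\, \eta^{\lambda\mu}\, \frac{\partial^3 F(v)}{\partial v^\mu \partial v^\gamma \partial v^\delta} = \frac{\partial^3 F(v)}{\partial v^\delta \partial v^\beta \partial v^\lambda}\, \eta^{\lambda\mu}\, \frac{\partial^3 F(v)}{\partial v^\mu \partial v^\gamma \partial v^\alpha}$$ [3]

for arbitrary $1 \le \alpha, \beta, \gamma, \delta \le n$. (Summation over
repeated indices will always be assumed.) The last
one is the so-called quasihomogeneity condition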


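$$EF = (3 - d) F + \tfrac{1}{2} A_{\alpha\beta} v^\alpha v^\beta + B_\alpha v^\alpha + C$$ [4]

where

$$E = \left( a^\alpha_\beta v^\beta + b^\alpha \right) \frac{\partial}{\partial v^\alpha}$$

for some constants $a^\alpha_\beta$, $b^\alpha$ satisfying

$$a^\alpha_1 = \delta^\alpha_1, \quad b^1 = 0$$

$A_{\alpha\beta}$, $B_\alpha$, C, d are some constants. E is called the Euler
vector field and d is the charge of the Frobenius
manifold.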
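The associativity condition can be tested numerically on a concrete potential. The sketch below uses the polynomial $F = \tfrac12 (v^1)^2 v^3 + \tfrac12 v^1 (v^2)^2 - \tfrac1{16} (v^2)^2 (v^3)^2 + \tfrac1{960} (v^3)^5$, a standard n = 3 example that is not taken from the text above; its third derivatives are entered by hand, and the function names are hypothetical. The code raises an index with $\eta^{\alpha\beta}$ and measures the largest violation of $(e_\alpha \cdot e_\beta) \cdot e_\gamma = e_\alpha \cdot (e_\beta \cdot e_\gamma)$ at a given point.

```python
import itertools

# Inverse of eta_{ab} = F_{1ab} for the example potential (antidiagonal, self-inverse).
ETA_INV = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]

def third_derivatives(v, quintic=1.0 / 960.0):
    """Symmetric tensor F_{abc} (0-based indices) of the example potential.

    `quintic` is the coefficient of (v^3)^5, exposed so it can be perturbed.
    """
    _, v2, v3 = v
    base = {
        (0, 0, 2): 1.0,
        (0, 1, 1): 1.0,
        (1, 1, 2): -v3 / 4,
        (1, 2, 2): -v2 / 4,
        (2, 2, 2): 60 * quintic * v3 ** 2,
    }
    c = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    for idx, val in base.items():
        for a, b, g in set(itertools.permutations(idx)):
            c[a][b][g] = val
    return c

def structure_constants(v, quintic=1.0 / 960.0):
    """c[g][a][b] = eta^{g e} F_{e a b}, as in equation [2]."""
    low = third_derivatives(v, quintic)
    return [[[sum(ETA_INV[g][e] * low[e][a][b] for e in range(3))
              for b in range(3)] for a in range(3)] for g in range(3)]

def associativity_defect(v, quintic=1.0 / 960.0):
    """Largest component of (e_a e_b) e_g - e_a (e_b e_g) over all index choices."""
    c = structure_constants(v, quintic)
    worst = 0.0
    for a, b, g, d in itertools.product(range(3), repeat=4):
        left = sum(c[e][a][b] * c[d][e][g] for e in range(3))
        right = sum(c[e][b][g] * c[d][a][e] for e in range(3))
        worst = max(worst, abs(left - right))
    return worst

point = (0.3, -1.2, 0.7)
print(associativity_defect(point))                      # rounding-level defect
print(associativity_defect(point, quintic=1.0 / 900.0))  # perturbed: clearly nonzero
```

Perturbing the quintic coefficient destroys associativity, which illustrates how restrictive the overdetermined system [3] is.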

For n = 1 one has $F(v) = (1/6) v^3$. For n = 2 one
can choose

$$F(u,v) = \tfrac{1}{2} u v^2 + f(u)$$

only the quasihomogeneity [4] makes a constraint
for f(u). The first nontrivial case is n = 3. The
solution to WDVV is expressed in terms of a
function f = f(x,y) in one of the two forms (in the
examples all indices are written as lower):

$$d \ne 0: \quad F = \tfrac{1}{2} v_1^2 v_3 + \tfrac{1}{2} v_1 v_2^2 + f(v_2, v_3), \qquad f_{xxy}^2 = f_{yyy} + f_{xxx} f_{xyy}$$

$$d = 0: \quad F = \tfrac{1}{6} v_1^3 + v_1 v_2 v_3 + f(v_2, v_3), \qquad f_{xxx} f_{yyy} - f_{xxy} f_{xyy} = 1$$ [5]

The function f(x,y) satisfies an additional constraint
imposed by [4]. Because of this the above PDEs [5]
can be reduced (Dubrovin 1992, 1996) to a
particular case of the Painlevé-VI equation (see
Painlevé Equations).
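As a quick consistency check of the first equation in [5] (the case $d \ne 0$), consider the polynomial $f(x,y) = -\tfrac{1}{16} x^2 y^2 + \tfrac{1}{960} y^5$. This is a classical example, usually associated with the $A_3$ singularity; it is quoted here for illustration and does not appear in the text above.

```latex
f_{xxx} = 0, \qquad f_{xxy} = -\tfrac{1}{4}\,y, \qquad f_{xyy} = -\tfrac{1}{4}\,x, \qquad f_{yyy} = \tfrac{1}{16}\,y^2
\quad\Longrightarrow\quad
f_{xxy}^2 = \tfrac{1}{16}\,y^2 = f_{yyy} + f_{xxx}\, f_{xyy}.
```

The resulting potential $F = \tfrac12 v_1^2 v_3 + \tfrac12 v_1 v_2^2 - \tfrac1{16} v_2^2 v_3^2 + \tfrac1{960} v_3^5$ is quasihomogeneous of degree 5/2 for the weights $\deg v_1 = 1$, $\deg v_2 = 3/4$, $\deg v_3 = 1/2$, so [4] holds with $E = v_1 \partial_1 + \tfrac34 v_2 \partial_2 + \tfrac12 v_3 \partial_3$, $3 - d = 5/2$, and vanishing quadratic terms, that is, $d = 1/2$.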

The problem [1], [3], [4] is invariant with respect
to linear changes of coordinates preserving the
direction of the vector $\partial/\partial v^1$:

$$v^\alpha \mapsto P^\alpha_\beta v^\beta + Q^\alpha, \quad \det(P^\alpha_\beta) \ne 0, \quad P^\alpha_1 = \delta^\alpha_1$$

It is also allowed to add to F(v) a polynomial of
degree at most 2. To consider more general nonlinear
changes of coordinates one has to give a
coordinate-free form of the above equations [1], [3],
[4]. This gives rise to the notion of Frobenius
manifold introduced in Dubrovin (1992).

Recall that a Frobenius algebra is a pair $(A, \langle\,,\rangle)$,
where A is a commutative associative algebra with a
unity e over a field k (we will consider only the cases
$k = \mathbb{R}, \mathbb{C}$) and $\langle\,,\rangle$ is a k-bilinear symmetric nondegenerate
invariant form on A, that is,

$$\langle x \cdot y, z \rangle = \langle x, y \cdot z \rangle$$

for arbitrary vectors x, y, z in A.


Definition A Frobenius structure $(\cdot, e, \langle\,,\rangle, E, d)$ on
the manifold M is a structure of a Frobenius algebra
on the tangent spaces $T_v M = (A_v, \langle\,,\rangle_v)$ depending
(smoothly, analytically, etc.) on the point $v \in M$. It
must satisfy the following axioms.

FM1. The curvature of the metric $\langle\,,\rangle_v$ on M
(not necessarily positive definite) vanishes. Denote by $\nabla$
the Levi-Civita connection for the metric. The unity
vector field e must be flat, $\nabla e = 0$.

FM2. Let c be the 3-tensor $c(x,y,z) := \langle x \cdot y, z \rangle$,
$x, y, z \in T_v M$. The 4-tensor $(\nabla_w c)(x,y,z)$ must
be symmetric in $x, y, z, w \in T_v M$.

FM3. A linear vector field $E \in \mathrm{Vect}(M)$ (called
Euler vector field) must be fixed on M, that is,
$\nabla\nabla E = 0$, such that

$$\mathrm{Lie}_E (x \cdot y) - \mathrm{Lie}_E x \cdot y - x \cdot \mathrm{Lie}_E y = x \cdot y$$

$$\mathrm{Lie}_E \langle\,,\rangle = (2 - d) \langle\,,\rangle$$

for some number $d \in k$ called "charge."


The last condition (also called quasihomogeneity)
means that the derivations $Q_{\mathrm{Func}(M)} := E$, $Q_{\mathrm{Vect}(M)} :=
\mathrm{id} + \mathrm{ad}_E$ define on the space Vect(M) of vector fields
on M a structure of graded Frobenius algebra over
the graded ring of functions Func(M).

Flatness of the metric $\langle\,,\rangle$ implies local existence
of a system of flat coordinates $v^1, \ldots, v^n$ on M.
Usually, they are chosen in such a way that

$$e = \frac{\partial}{\partial v^1}$$

is the unity vector field. In such coordinates, the
problem of local classification of Frobenius manifolds
reduces to the WDVV associativity equations
[1], [3], [4]. Namely, $\eta_{\alpha\beta}$ is the constant Gram
matrix of the metric in these coordinates

$$\eta_{\alpha\beta} := \left\langle \frac{\partial}{\partial v^\alpha}, \frac{\partial}{\partial v^\beta} \right\rangle$$
The structure constants of the Frobenius algebra
$A_v = T_v M$

$$\frac{\partial}{\partial v^\alpha} \cdot \frac{\partial}{\partial v^\beta} = c^\gamma_{\alpha\beta}(v) \frac{\partial}{\partial v^\gamma}$$ [6]

can be locally represented by third derivatives [2] of
a function F(v) satisfying [1], [3], [4]. The function
F(v) is called the "potential" of the Frobenius manifold.
It is defined up to adding an at most quadratic
polynomial in $v^1, \ldots, v^n$.

A generalization of the above definition to the 
case of Frobenius supermanifolds can be found in 
Manin (1999). For the more general class of the 
so-called F-manifolds, the requirement of the 
existence of a flat invariant metric has been relaxed. 


Deformed Flat Connection 


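One of the main geometrical structures of the theory
of Frobenius manifolds is the deformed flat connection.
This is a symmetric affine connection on $M \times \mathbb{C}^*$
defined by the following formulas:

$$\tilde\nabla_x y = \nabla_x y + z\, x \cdot y, \quad x, y \in TM,\ z \in \mathbb{C}^*$$

$$\tilde\nabla_{d/dz}\, y = \partial_z y + E \cdot y - \frac{1}{z} \mathcal{V} y$$ [7]

$$\tilde\nabla_x \frac{d}{dz} = \tilde\nabla_{d/dz} \frac{d}{dz} = 0$$

where, as above, $\nabla$ is the Levi-Civita connection for
the metric $\langle\,,\rangle$ and

$$\mathcal{V} := \frac{2 - d}{2} - \nabla E$$ [8]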


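is an operator on the tangent bundle TM antisymmetric
with respect to $\langle\,,\rangle$,

$$\langle \mathcal{V} x, y \rangle = -\langle x, \mathcal{V} y \rangle$$

Observe that the unity vector field e is an eigenvector
of this operator with the eigenvalue $-d/2$:

$$\mathcal{V} e = -\frac{d}{2}\, e$$

The connection $\tilde\nabla = \tilde\nabla(z)$ is not metric but it satisfies

$$\nabla \langle x, y \rangle = \langle \tilde\nabla(-z) x, y \rangle + \langle x, \tilde\nabla(z) y \rangle, \quad x, y \in TM$$

for any $z \in \mathbb{C}^*$. As was discovered in Dubrovin
(1992), vanishing of the curvature of the connection
$\tilde\nabla$ is essentially equivalent to the axioms of
Frobenius manifold.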
Definition A "deformed flat function" f(v;z) on a
domain in $M \times \mathbb{C}^*$ is defined by the requirement of
horizontality of the differential df

$$\tilde\nabla\, df = 0$$ [9]

Due to vanishing of the curvature of $\tilde\nabla$, locally
there exist n independent deformed flat functions
$f_1(v;z), \ldots, f_n(v;z)$ such that their differentials,
together with the flat 1-form dz, span the cotangent
plane $T^*_{(v,z)}(M \times \mathbb{C}^*)$. They will be called "deformed
flat coordinates." The global analytic properties of
deformed flat coordinates can be derived, for the
case of semisimple Frobenius manifolds, from the
results of the section "Moduli of semisimple
Frobenius manifolds" discussed later.

One can relax the definition of Frobenius manifold
by dropping the last axiom FM3. The potential F(v) in
this case satisfies [1] and [3] but not [4]. In this case,
the deformed flat connection $\tilde\nabla$ is just a family of
affine flat connections on M depending on the
parameter $z \in \mathbb{C}$ given by the first line in [7]. The
curvature and torsion of this family of connections
vanish identically in z. The deformed flat functions
of $\tilde\nabla$ defined as in [9] can be chosen in the form of
power series in z. The flatness equations written in the
flat coordinates on M yield a recursion equation for
the coefficients of these power series

$$\tilde\nabla\, df = 0, \quad f = \sum_{p \ge 0} \theta_p(v)\, z^p$$

$$\partial_\alpha \partial_\beta f = z\, c^\gamma_{\alpha\beta}(v)\, \partial_\gamma f$$

$$\partial_\alpha \partial_\beta \theta_0(v) = 0$$ [10]

$$\partial_\alpha \partial_\beta \theta_{p+1}(v) = c^\gamma_{\alpha\beta}(v)\, \partial_\gamma \theta_p(v), \quad p \ge 0$$

Thus, f(v;0) is just an affine linear function of the
flat coordinates $v^1, \ldots, v^n$; the dependence on z can
be considered as a deformation of the affine
structure. This motivates the name "deformed flat
coordinates." The coefficients of the expansions of
the deformed flat coordinates are the leading terms
of the $\epsilon$-expansion of the Hamiltonian densities
of the integrable hierarchies associated with the
Frobenius manifolds (see below).


Intersection Form of a 
Frobenius Manifold 


Another important geometric structure on M is the
intersection form of the Frobenius manifold. It is a
symmetric bilinear form on the cotangent bundle
$T^*M$ defined by the formula

$$(\omega_1, \omega_2) := i_E\, \omega_1 \cdot \omega_2, \quad \omega_1, \omega_2 \in T_v^* M$$ [11]

Here the multiplication law on the cotangent planes
is defined by means of the isomorphism

$$\langle\,,\rangle : TM \to T^*M$$

The discriminant $\Sigma \subset M$ is a proper analytic (for an
analytic M) subset where the intersection form
degenerates. One can introduce a new metric on
the open subset $M \setminus \Sigma$ by taking the inverse of the
intersection form. A remarkable result of the theory
of Frobenius manifolds is vanishing of the curvature
of this new metric. Moreover, the new flat metric
together with the following new multiplication:

$$x * y := x \cdot y \cdot E^{-1}$$

defines on $M \setminus \Sigma$ a structure of an almost-dual
Frobenius manifold (Dubrovin 2004). In the original
flat coordinates $v^1, \ldots, v^n$ the coordinate expressions
for the new metric and for the associated Levi-Civita
connection $\nabla^*$, called the Gauss-Manin connection,
read

$$g^{\alpha\beta}(v) := (dv^\alpha, dv^\beta) = E^\epsilon(v)\, c^{\alpha\beta}_\epsilon(v)$$

$$\nabla^{*\alpha} dv^\beta = \Gamma^{\alpha\beta}_\gamma(v)\, dv^\gamma$$ [12]

$$\Gamma^{\alpha\beta}_\gamma(v) := -g^{\alpha\epsilon}(v)\, \Gamma^\beta_{\epsilon\gamma}(v) = c^{\alpha\epsilon}_\gamma(v) \left( \tfrac{1}{2} - \mathcal{V} \right)^\beta_\epsilon$$


The pair $(\,,)$ and $\langle\,,\rangle$ of bilinear forms on $T^*M$
possesses the following property crucial for understanding
the relationships between Frobenius manifolds
and integrable systems: they form a flat pencil.
That means that on the complement to the subset

$$\Sigma_\lambda := \{ v \in M \mid \det(g^{\alpha\beta}(v) - \lambda \eta^{\alpha\beta}) = 0 \}$$

the inverse to the bilinear form

$$(\,,)_\lambda := (\,,) - \lambda \langle\,,\rangle$$ [13]

defines a metric with vanishing curvature. Flat
functions $p = p(v;\lambda)$ for the flat metric are determined
from the system

$$(\nabla^* - \lambda \nabla)\, dp = 0$$ [14]


They are called "periods" of the Frobenius manifold.
The periods $p(v;\lambda)$ are related to the deformed flat
functions f(v;z) by the suitably regularized Laplace-type
integral transform

$$p(v;\lambda) = \oint e^{-\lambda z} f(v;z)\, \frac{dz}{\sqrt{z}}$$ [15]


Choosing a system of n independent periods, one
obtains a system of flat coordinates $p^1(v;\lambda), \ldots,
p^n(v;\lambda)$ for the metric $(\,,)_\lambda$ on $M \setminus \Sigma_\lambda$,

$$(dp^i(v;\lambda), dp^j(v;\lambda))_\lambda = G^{ij}$$ [16]

for some constant nondegenerate matrix $G^{ij}$.


The structure of a flat pencil on the Frobenius
manifold M gives rise to a natural Poisson pencil
(= bi-Hamiltonian structure) on the infinite-dimensional
"manifold" $\mathcal{L}(M)$ consisting of smooth maps
of a circle to M (the so-called loop space). In the flat
coordinates $v^1, \ldots, v^n$ for the metric $\langle\,,\rangle$ the
Poisson pencil has the form

$$\{v^\alpha(x), v^\beta(y)\}_1 = \eta^{\alpha\beta}\, \delta'(x - y)$$

$$\{v^\alpha(x), v^\beta(y)\}_2 = g^{\alpha\beta}(v(x))\, \delta'(x - y) + \Gamma^{\alpha\beta}_\gamma(v(x))\, v^\gamma_x\, \delta(x - y)$$ [17]

By definition of the Poisson pencil, the linear
combination $a_1 \{\,,\}_1 + a_2 \{\,,\}_2$ of the Poisson brackets
is again a Poisson bracket for arbitrary constants
$a_1, a_2$. Choosing a system of n independent periods
$p^i(v;\lambda)$, $i = 1, \ldots, n$, as a new system of dependent
variables, one obtains a reduction of the Poisson
bracket $\{\,,\}_\lambda := \{\,,\}_2 - \lambda \{\,,\}_1$ for a given $\lambda$ to the
canonical form

$$\{p^i(v(x);\lambda), p^j(v(y);\lambda)\}_\lambda = G^{ij}\, \delta'(x - y)$$ [18]

Under an additional assumption of existence of a tau
function (Dubrovin 1996, Dubrovin and Zhang),
one can prove that any Poisson pencil on $\mathcal{L}(M)$ of
the form [17] with a nondegenerate matrix $(\eta^{\alpha\beta})$
comes from a Frobenius structure on M.


Canonical Coordinates on Semisimple 
Frobenius Manifolds 


Definition The Frobenius manifold M is called
semisimple if the algebras $T_v M$ are semisimple for
v belonging to an open dense subset in M.

Any n-dimensional semisimple Frobenius algebra
over $\mathbb{C}$ is isomorphic to the orthogonal direct sum of
n copies of one-dimensional algebras. In this section,
all the manifolds will be assumed to be complex
analytic.

Near a semisimple point, the roots $u_i = u_i(v)$,
$i = 1, \ldots, n$, of the characteristic equation

$$\det(g^{\alpha\beta}(v) - \lambda \eta^{\alpha\beta}) = 0$$ [19]

can be used as local coordinates. The vectors
$\partial/\partial u_i$, $i = 1, \ldots, n$, are basic idempotents of the
algebras $T_v M$

$$\frac{\partial}{\partial u_i} \cdot \frac{\partial}{\partial u_j} = \delta_{ij} \frac{\partial}{\partial u_i}$$

We call $u_1, \ldots, u_n$ "canonical coordinates." Observe
that we violate the indices convention by labeling the
canonical coordinates by subscripts. We will never
use summation over repeated indices when working
in the canonical coordinates. Actually, existence of
canonical coordinates can be proved without using
[4] (see details in Dubrovin (1992)).

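Choosing locally branches of the square roots

$$\psi_{i1}(u) := \sqrt{\langle \partial/\partial u_i, \partial/\partial u_i \rangle}, \quad i = 1, \ldots, n$$ [20]

we obtain a transition matrix $\Psi = (\psi_{i\alpha}(u))$,

$$\frac{\partial}{\partial v^\alpha} = \sum_{i=1}^{n} \frac{\psi_{i\alpha}(u)}{\psi_{i1}(u)} \frac{\partial}{\partial u_i}$$ [21]

from the basis $\partial/\partial v^\alpha$ to the orthonormal basis
$\langle f_i, f_j \rangle = \delta_{ij}$

$$f_1 = \psi_{11}^{-1}(u) \frac{\partial}{\partial u_1}, \quad f_2 = \psi_{21}^{-1}(u) \frac{\partial}{\partial u_2}, \quad \ldots, \quad f_n = \psi_{n1}^{-1}(u) \frac{\partial}{\partial u_n}$$ [22]

The matrix $\Psi(u)$ satisfies the orthogonality condition

$$\Psi^T(u)\, \Psi(u) = \eta, \quad \eta = (\eta_{\alpha\beta}), \quad \eta_{\alpha\beta} := \left\langle \frac{\partial}{\partial v^\alpha}, \frac{\partial}{\partial v^\beta} \right\rangle$$

In this formula $\Psi^T$ stands for the transposed matrix.
The lengths [20] coincide with the first column of
this matrix.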

Denote by $V(u) = (V_{ij}(u))$ the matrix of the antisymmetric
operator $\mathcal{V}$ [8] with respect to the orthonormal
frame

$$V(u) := \Psi(u)\, \mathcal{V}\, \Psi^{-1}(u)$$ [23]

The antisymmetric matrix $V(u) = (V_{ij}(u))$ satisfies
the following system of commuting time-dependent
Hamiltonian flows on the Lie algebra so(n)
equipped with the standard Lie-Poisson brackets
$\{V_{ij}, V_{kl}\} = V_{il}\delta_{jk} - V_{jl}\delta_{ik} + V_{jk}\delta_{il} - V_{ik}\delta_{jl}$:

$$\frac{\partial V}{\partial u_i} = \{V, H_i(V;u)\}, \quad i = 1, \ldots, n$$ [24]

with quadratic Hamiltonians

$$H_i(V;u) = \frac{1}{2} \sum_{j \ne i} \frac{V_{ij}^2}{u_i - u_j}$$ [25]


The matrix $\Psi(u)$ satisfies

$$\frac{\partial \Psi}{\partial u_i} = V_i(u)\, \Psi, \quad V_i(u) := \mathrm{ad}_{E_i}\, \mathrm{ad}_U^{-1}(V(u)), \quad i = 1, \ldots, n$$ [26]

Here the matrix unity $E_i$ has the entries $(E_i)_{ab} =
\delta_{ai}\delta_{ib}$, $U = \mathrm{diag}(u_1, \ldots, u_n)$. Conversely, given a solution
to [24] and [26], one can reconstruct the
Frobenius manifold structure by quadratures
(Dubrovin 1998). The reconstruction depends on a
choice of an eigenvector of the constant matrix
$\mathcal{V} = \Psi^{-1}(u)\, V(u)\, \Psi(u)$.

The system [24] coincides with the equations of
isomonodromic deformations (see Isomonodromic
Deformations) of the following linear differential
operator with rational coefficients:

$$\frac{dY}{dz} = \left( U + \frac{V}{z} \right) Y$$ [27]

The latter is nothing but the last component of the
deformed flat connection [7] written in the orthonormal
frame [22]. Other components of the
horizontality equations yield

$$\partial_i Y = (z E_i + V_i)\, Y, \quad i = 1, \ldots, n$$ [28]

The compatibility conditions of the system [27] and
[28] coincide with [24].

The integration of [24], [26] and, more generally, 
the reconstruction of the Frobenius structure can be 
reduced to a solution of a certain Riemann—Hilbert 
problem (see Riemann-Hilbert Problem). 

The isomonodromic tau function of the semisimple
Frobenius manifold is defined by

$$d \log \tau(u) = \sum_{i=1}^{n} H_i(V(u); u)\, du_i$$ [29]

It is an analytic function on a suitable unramified
covering of the semisimple part of M.

Alternatively, eqns [24] can be represented as the
isomonodromy deformations of the dual Fuchsian
system

$$(U - \lambda) \frac{d\phi}{d\lambda} = \left( \tfrac{1}{2} + V \right) \phi$$ [30]

The latter comes from the Gauss-Manin system for
the periods $p = p(v;\lambda)$ of the Frobenius manifold
written in the canonical coordinates [22].


Moduli of Semisimple 
Frobenius Manifolds 


All n-dimensional semisimple Frobenius manifolds 
form a finite-dimensional space. They depend on 
n(n — 1)/2 essential parameters. To parametrize the 
Frobenius manifolds one can choose, for example, 
the initial data for the isomonodromy deformation 
equations [24]. Alternatively, they can be parame- 
trized by monodromy data of the deformed flat 
connection according to the following construction. 

The first part of the monodromy data is the
spectrum $(V, \langle\,,\rangle, \hat\mu, R)$ of the Frobenius manifold
associated with the Poisson pencil. Here V is an
n-dimensional linear space equipped with a symmetric
nondegenerate bilinear form $\langle\,,\rangle$. Two
linear operators on V, a semisimple operator
$\hat\mu: V \to V$ and a nilpotent operator $R: V \to V$, must
satisfy the following properties. First, the operator $\hat\mu$
is antisymmetric:

$$\hat\mu^* = -\hat\mu$$ [31]

and the operator R satisfies

$$R^* = -e^{\pi i \hat\mu} R\, e^{-\pi i \hat\mu}$$ [32]

Here the adjoint operators are defined with respect
to the bilinear form $\langle\,,\rangle$. The last condition to be
imposed onto the operator R can be formulated in a
simple way by choosing a basis $e_1, \ldots, e_n$ of
eigenvectors of the semisimple operator $\hat\mu$,

$$\hat\mu\, e_\alpha = \mu_\alpha e_\alpha, \quad \alpha = 1, \ldots, n$$

We require the existence of a decomposition

$$R = R_0 + R_1 + R_2 + \cdots$$ [33]


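where for any integer $k \ge 0$ the linear operator $R_k$
satisfies

$$R_k e_\alpha \in \mathrm{span}\{ e_\beta \mid \mu_\beta = \mu_\alpha + k \} \quad \forall \alpha = 1, \ldots, n$$ [34]

In the nonresonant case, that is, when none of the
differences of the eigenvalues of $\hat\mu$ is equal to a
positive integer, all the matrices $R_1, R_2, \ldots$ are equal
to zero. Observe a useful identity

$$z^{\hat\mu} R\, z^{-\hat\mu} = R_0 + z R_1 + z^2 R_2 + \cdots$$ [35]

More generally, for any operator $A: V \to V$ commuting
with $e^{2\pi i \hat\mu}$ a decomposition is defined as

$$A = \bigoplus_{k \in \mathbb{Z}} [A]_k, \quad z^{\hat\mu} A\, z^{-\hat\mu} = \sum_{k \in \mathbb{Z}} z^k [A]_k$$ [36]

In particular, $[R]_k = R_k$, $k \ge 0$, $[R]_k = 0$, $k < 0$.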

One has to also choose an eigenvector e of the 
operator ñ such that Rye=0; denote —d/2 the 
corresponding eigenvalue 


ec V, Roe — 0 [37] 
The second part of the monodromy data is a pair 
of linear operators 


$C: V \to \mathbb{C}^n, \quad S: \mathbb{C}^n \to \mathbb{C}^n$

The space $\mathbb{C}^n$ is assumed to be equipped with the
standard complex Euclidean structure given by
the sum of squares. The properties of the operators
S, C depend on the choice of an unordered set
$u^0 = (u_1^0, \dots, u_n^0)$ of n pairwise distinct complex
numbers and on a choice of a ray $\ell_+$ on an auxiliary
complex z-plane starting at the origin such that


$\mathrm{Re}\, z(u_i^0 - u_j^0) \ne 0, \quad i \ne j, \quad z \in \ell_+$ [38]


Let us order the complex numbers in such a way that


$e^{z(u_i^0 - u_j^0)} \to 0, \quad i < j, \quad |z| \to \infty, \quad z \in \ell_+$ [39]


The operator S must be upper triangular


$S = (S_{ij}), \quad S_{ij} = 0 \ \text{for}\ i > j, \quad S_{ii} = 1, \quad i = 1, \dots, n$ [40]
The operator C must satisfy 
9 47 = e e™R [41] 


Here the adjoint operator C* is understood as 
follows: 


T 95,7 
Cr- [5 3 CA Er y* =$ V 


The group of diagonal n x n matrices
$D = \mathrm{diag}(\pm 1, \dots, \pm 1)$
acts on the pairs (S, C) by


$S \mapsto D S D, \quad C \mapsto D C$


One is to factor out the action of this diagonal
group. Besides, the operator C is defined up to a left
action of a certain group of linear operators depending
on the spectrum.

For the generic (i.e., nonresonant) case where
$e^{2\pi i \hat\mu}$ has simple spectrum, the operator C is defined
up to left multiplication by any matrix commuting
with $e^{2\pi i \hat\mu}$. In this situation, the monodromy data
$(\hat\mu, R, S, C)$ are locally uniquely determined by the
n(n - 1)/2 entries of the matrix S. Therefore, near a
generic point, the variety of the monodromy data is
a smooth manifold of the dimension n(n - 1)/2. At
nongeneric points, the variety can get additional
strata.

The monodromy data S, C are determined at an
arbitrary semisimple point of a Frobenius manifold
in terms of the analytic properties of horizontal
sections of the deformed flat connection $\tilde\nabla$ [7] in the
complex z-plane (the so-called “Stokes matrix” and
the “central connection matrix” of the operator
[27]). Locally, they do not depend on the point of
the semisimple Frobenius manifold (the isomonodromicity
property).

We will now describe the reconstruction procedure
giving a parametrization of semisimple Frobenius
manifolds in terms of the monodromy data $(\hat\mu, R, S, C)$.

Conversely, to reconstruct the Frobenius manifold
near a semisimple point with the canonical coordinates
$u_1^0, \dots, u_n^0$, one is to solve the following
boundary-value problem. Let


$\ell = (-\ell_-) \cup \ell_+$


be the oriented line on the complex z-plane chosen
as in [38]. Here the ray $\ell_-$ is the opposite to $\ell_+$.
Denote $\Pi_R / \Pi_L$ the right/left half-planes with respect
to $\ell$. To reconstruct the Frobenius manifold, one is
to find three matrix-valued functions $\Phi_0(z;u)$,
$\Phi_R(z;u)$, and $\Phi_L(z;u)$:


$\Phi_0(z;u): V \to \mathbb{C}^n$
$\Phi_{R/L}(z;u): \mathbb{C}^n \to \mathbb{C}^n$


for u close to $u^0$ such that $\Phi_0(z;u)$ is analytic and
invertible for $z \in \mathbb{C}$, $\Phi_R(z;u) / \Phi_L(z;u)$ are analytic
and invertible for $z \in \Pi_R / \Pi_L$ resp., and continuous
up to the boundary $\ell \setminus 0$ and


$\Phi_{R/L}(z;u) \sim 1 + O(1/z), \quad |z| \to \infty, \quad z \in \Pi_R / \Pi_L$

The boundary values of the functions
$\Phi_0(z;u)$, $\Phi_R(z;u)$, and $\Phi_L(z;u)$ must satisfy the
following boundary-value problem (as above
$U = \mathrm{diag}(u_1, \dots, u_n)$):


Prízzu) = Pulz uje "Ser", zel, [42] 
$n(z;u) = 41(z;u)e? P S*e *", zee [43] 
$o(z;u)z"z5 = g(z;u)? C, zer 144 
@o(z;u)z"z® = 6, (zu) SC, ze 

Here $z^{\hat\mu} := e^{\hat\mu \log z}$ and $z^R := e^{R \log z}$ are considered as
Aut(V)-valued functions on the universal covering
of $\mathbb{C} \setminus 0$; the branch cut in the definition of log z is
chosen to be along $\ell_-$.

The solution of the above boundary-value problem
[42]-[44], if it exists, is unique. It can be reduced
to a certain Riemann-Hilbert problem, that is, to a
problem of factorization of an analytic n x n
nondegenerate matrix-valued function on the
annulus

$G(z;u), \quad r < |z| < R, \quad \det G(z;u) \ne 0$

depending on the parameter $u = (u_1, \dots, u_n)$ in a
product

$G(z;u) = G_0(z;u)^{-1}\, G_\infty(z;u)$ [45]

of two matrix-valued functions $G_0(z;u)$ and
$G_\infty(z;u)$ analytic for $|z| < R$ and $r < |z| \le \infty$ resp.,
with nowhere-vanishing determinant.
Existence of a solution to the Riemann-Hilbert
problem for a given $u = (u_1, \dots, u_n)$, $u_i \ne u_j$ for $i \ne j$,
means triviality of a certain n-dimensional vector
bundle over the Riemann sphere with the transition
functions given by G(z;u). Existence of the solution
for $u = u^0$ implies solvability of the Riemann-Hilbert
problem for u sufficiently close to $u^0$. From
these arguments, it can be deduced that the matrices
$\Phi_0(z;u)$, $\Phi_{R/L}(z;u)$ are analytic in (z;u) for u
sufficiently close to $u^0$. Moreover, they can be
analytically continued in u to the universal covering
of the space of configurations of n distinct points on
the complex plane:


$\left( \mathbb{C}^n \setminus \bigcup_{i<j} \{ u_i = u_j \} \right) / S_n$ [46]


The resulting functions are meromorphic on the
universal covering, according to the results of
B Malgrange and T Miwa. The structure of the
global analytic continuation is given (Dubrovin
1999) in terms of a certain action of the braid group


$B_n = \pi_1\left( \left( \mathbb{C}^n \setminus \bigcup_{i<j} \{ u_i = u_j \} \right) / S_n \right)$


on the monodromy data.


Examples of Frobenius Manifolds 


Example 0 Trivial Frobenius manifold, M = A, a
graded Frobenius algebra, $F(v) = \tfrac{1}{6} \langle e, v \cdot v \cdot v \rangle$
is a cubic polynomial.


First nontrivial examples appeared in the setting 
of 2D topological field theories (Dijkgraaf et al. 
1991, Witten 1991) (see Topological Quantum Field 
Theory: Overview). Mathematical formalization of 
these ideas gives rise to the following two classes of 
examples. 


Example 1 Frobenius structure on the base of an
isolated hypersurface singularity. The construction
(Hertling 2002, Sabbah 2002) uses the K Saito
theory of periods of primitive forms. For the
example of the $A_n$ singularity $f(x) = x^{n+1}$ the Frobenius
structure on the base of universal unfolding


$M_{A_n} = \{ f_s(x) = x^{n+1} + s_1 x^{n-1} + \cdots + s_n \mid s_1, \dots, s_n \in \mathbb{C} \}$


is constructed as follows (Dijkgraaf et al. 1991):


- O 
OS» 

1 O 
= us 
n 12 i Sk as. 
NL 


|. 5241 


The multiplication is introduced by identifying the
tangent space $T_s M_{A_n}$ with the quotient algebra


$T_s M_{A_n} = \mathbb{C}[x] / (f_s'(x))$
The metric has the form 
Ofs(x)/0s¡0fs(x)/0sj de 
fox) 


The flat coordinates v, =v,(s) can be found from 
the expansion of the solution to the equation 


fs(x) = ii 


1 ME Vez v] 1 
-a AS + O( gas) 
The potentials of the Frobenius manifolds Ma, for 
n=1,2,3 read 


< 0,,0, >= —(n + 1) res,— 


xk 


6 
Ba, —iviv t3 [47] 


The space of polynomials $M_{A_n}$ can be identified with
the orbit space $\mathbb{C}^n / W(A_n)$ of the Weyl group of the
type $A_n$. More generally (Dubrovin 1996), the orbit
space $M_W := \mathbb{C}^n / W$ of an arbitrary irreducible finite
Coxeter group $W \subset O(n)$ carries a natural structure
of a polynomial semisimple Frobenius manifold.
Conversely, all irreducible polynomial semisimple
Frobenius manifolds with positive degrees of the flat
coordinates can be obtained by this construction
(Hertling 2002). Generalizations for the orbit spaces
of certain infinite groups were obtained in Dubrovin
and Zhang (1998b) and Bertola (2000).


Example 2 Gromov-Witten (GW) invariants (see
Topological Sigma Models). Let X be a smooth
projective variety. We will assume for simplicity that
$H^{odd}(X) = 0$. To every such variety, one can associate
a collection of rational numbers. They are expressed
in terms of intersection theory of certain cycles on
the moduli spaces $X_{g,m,\beta}$ of stable genus g and
degree $\beta$ curves on X with m marked points (see
details in Kontsevich and Manin (1994)):


$X_{g,m,\beta} = \{ f: (C_g, x_1, \dots, x_m) \to X, \quad f_*[C_g] = \beta \in H_2(X;\mathbb{Z}) \}$ [48]


Denote $n := \dim H^*(X;\mathbb{C})$. Choosing a basis $\phi_1 = 1,
\phi_2, \dots, \phi_n$, we define the numbers


$\langle \tau_{p_1}(\phi_{\alpha_1}) \cdots \tau_{p_m}(\phi_{\alpha_m}) \rangle_{g,\beta} = \int_{[X_{g,m,\beta}]^{virt}} ev_1^*(\phi_{\alpha_1}) \wedge c_1^{p_1}(L_1) \wedge \cdots \wedge ev_m^*(\phi_{\alpha_m}) \wedge c_1^{p_m}(L_m)$ [49]


for arbitrary non-negative integers $p_1, \dots, p_m$. Here
the evaluation maps $ev_i$, $i = 1, \dots, m$, are given by


$ev_i: X_{g,m,\beta} \to X, \quad f \mapsto f(x_i)$


The so-called tautological line bundles $L_i$ over $X_{g,m,\beta}$
by definition have the fiber $T^*_{x_i} C_g$, $i = 1, \dots, m$ (see
the article Moduli Spaces: An Introduction regarding
the construction of the so-called virtual fundamental
class $[X_{g,m,\beta}]^{virt}$). The numbers [49] can be defined
for an arbitrary compact symplectic manifold X
where one is to deal with the intersection theory on
the moduli spaces of pseudoholomorphic curves
fixing a suitable almost-complex structure on X.
They depend only on the symplectic structure on X.
In particular, the numbers


$\langle \tau_0(\phi_{\alpha_1}) \cdots \tau_0(\phi_{\alpha_m}) \rangle_{g,\beta}$ [50]


are called the genus g and degree $\beta$ GW invariants of
X. In certain cases, they admit an interpretation in
terms of enumerative geometry of the variety X
(Kontsevich and Manin 1994). The numbers [49]
with some of $p_i > 0$ are called “gravitational
descendents.”


One can form a generating function of the
numbers [49]


$F_g = \sum_m \sum_{\beta \in H_2(X;\mathbb{Z})} \frac{1}{m!}\, t^{\alpha_1,p_1} \cdots t^{\alpha_m,p_m} \langle \tau_{p_1}(\phi_{\alpha_1}) \cdots \tau_{p_m}(\phi_{\alpha_m}) \rangle_{g,\beta}$ [51]


(summation over repeated indices $1 \le \alpha_1, \dots, \alpha_m \le n$
will always be assumed). Here $t^{\alpha,p}$ are indeterminates
labeled by pairs $(\alpha, p)$ with $\alpha = 1, \dots, n$,
$p = 0, 1, 2, \dots$. (Usually one is to insert in the
definition of $F_g$ elements $q^\beta$ of the Novikov ring
$\mathbb{C}[H_2(X;\mathbb{Z})]$. However, due to the divisor axiom
(Kontsevich and Manin 1994) these insertions
can be compensated by a suitable shift in the space
of couplings $t = (t^{\alpha,p})$.) We finally introduce the full
generating function called total GW potential (it is
also called the free energy of the topological sigma
model with the target space X)


$F^X(t;\varepsilon) = \sum_{g \ge 0} \varepsilon^{2g-2} F_g(t)$ [52]


Restricting the genus-zero generating function
onto the so-called small phase space


$F(v) := F_0\left( t^{\alpha,0} = v^\alpha,\ t^{\alpha,p>0} = 0 \right)$ [53]

one obtains a solution to the WDVV associativity
equations. This solution defines a structure of
(formal) Frobenius manifold on $H^*(X)$ with the
bilinear form $\eta$ given by the Poincaré pairing


$\eta_{\alpha\beta} = \int_X \phi_\alpha \wedge \phi_\beta$


the unity


$e = \frac{\partial}{\partial v^1}$


and the Euler vector field


$E = \sum_\alpha \left[ (1 - q_\alpha) v^\alpha + r_\alpha \right] \frac{\partial}{\partial v^\alpha}$


Here the numbers $q_\alpha$, $r_\alpha$ are defined by the
conditions


$\phi_\alpha \in H^{2 q_\alpha}(X), \qquad c_1(X) = \sum_\alpha r_\alpha \phi_\alpha$

The resulting Frobenius manifold will be denoted
$M_X$. The corresponding n-parameter family of
n-dimensional algebras on the tangent spaces $T_v M_X$
is also called “quantum cohomology” $QH^*(X)$. At
the point $v_{cl} \in M_X$ of classical limit, the algebra
$T_{v_{cl}} M_X$ coincides with the cohomology ring $H^*(X)$.
In all known examples, the series [53] actually
converges in a neighborhood of the point $v_{cl}$.
Therefore, one obtains a genuine Frobenius structure
on a domain $M_X \subset H^*(X;\mathbb{C}) / 2\pi i H^2(X;\mathbb{Z})$. However,
a general proof of convergence is still missing.

In particular, for d = 1, the quantum cohomology
of the complex projective line $P^1$ is a two-dimensional
Frobenius manifold with the potential, unity, and
the Euler vector field


$F(u, v) = \tfrac{1}{2} u v^2 + e^u, \quad e = \frac{\partial}{\partial v}, \quad E = v \frac{\partial}{\partial v} + 2 \frac{\partial}{\partial u}$


For d = 2 one has a three-dimensional Frobenius
manifold $QH^*(P^2)$ with


$F(v_1, v_2, v_3) = \tfrac{1}{2} v_1^2 v_3 + \tfrac{1}{2} v_1 v_2^2 + \sum_{k \ge 1} N_k \frac{v_3^{3k-1}}{(3k-1)!}\, e^{k v_2}$
$e = \frac{\partial}{\partial v_1}, \quad E = v_1 \frac{\partial}{\partial v_1} + 3 \frac{\partial}{\partial v_2} - v_3 \frac{\partial}{\partial v_3}$ [54]


where $N_k$ is the number of rational curves of degree k on $P^2$
passing through 3k - 1 generic points. WDVV [5] yields
(Kontsevich and Manin 1994) recursion relations for
the numbers $N_k$ starting from $N_1 = 1$. The closed
analytic formula for the function [54] is still unknown.
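
As an illustration of such recursion relations, the sketch below evaluates the numbers $N_k$ from the recursion usually quoted as Kontsevich's formula. The explicit form of the recursion is taken from the general literature rather than from this article, and the function name is hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def rational_plane_curves(d: int) -> int:
    """Number N_d of degree-d rational plane curves through 3d - 1 generic points.

    Assumed recursion (Kontsevich's formula), starting from N_1 = 1:
        N_d = sum over d1 + d2 = d, d1, d2 >= 1 of
              N_d1 * N_d2 * ( d1^2 * d2^2 * C(3d - 4, 3*d1 - 2)
                              - d1^3 * d2 * C(3d - 4, 3*d1 - 1) )
    """
    if d < 1:
        raise ValueError("degree must be a positive integer")
    if d == 1:
        return 1  # a single line passes through two generic points
    total = 0
    for d1 in range(1, d):
        d2 = d - d1
        weight = (
            d1 * d1 * d2 * d2 * comb(3 * d - 4, 3 * d1 - 2)
            - d1 ** 3 * d2 * comb(3 * d - 4, 3 * d1 - 1)
        )
        total += rational_plane_curves(d1) * rational_plane_curves(d2) * weight
    return total


if __name__ == "__main__":
    print([rational_plane_curves(k) for k in range(1, 6)])
```

The memoized recursion only needs integer arithmetic, so the values are exact for any degree that fits in memory.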

Only for certain very exceptional X is the Frobenius
manifold $M_X$ semisimple (e.g., for $X = P^d$). The
general geometric reasons for the semisimplicity of
$M_X$ are yet to be understood.

For the case X = Calabi-Yau manifold, the Frobenius
manifold $QH^*(X)$ is never semisimple. This
Frobenius structure can be computed in terms of the
mirror symmetry construction (see Mirror Symmetry:
A Geometric Survey).


Frobenius Manifold and Integrable 
Systems 


The identities in the cohomology ring generated by
the cocycles $ev_i^*(\phi_\alpha)$ and $\psi_i := c_1(L_i)$ can be recast
into the form of differential equations for the
generating function [52]. The variable $x := t^{1,0}$
corresponding to $\phi_1 = 1$ plays a distinguished role
in these differential equations. According to the idea
of Witten (1991), the differential equations for the
generating functions can be written as a hierarchy of
systems of n evolutionary PDEs ($n = \dim H^*(X)$) for
the unknown functions


$w_\alpha = \langle\langle \tau_0(\phi_\alpha) \tau_0(\phi_1) \rangle\rangle = \varepsilon^2 \frac{\partial^2 F^X(t;\varepsilon)}{\partial x\, \partial t^{\alpha,0}}$ [55]


The variable x is the spatial variable of the
equations of the hierarchy. The remaining parameters
(coupling constants) $t^{\alpha,p}$ of the generating
function play the role of the time variables. Witten
suggested to use the two-point correlators


$h_{\alpha,p} = \langle\langle \tau_{p+1}(\phi_\alpha) \tau_0(\phi_1) \rangle\rangle = \varepsilon^2 \frac{\partial^2 F^X(t;\varepsilon)}{\partial x\, \partial t^{\alpha,p+1}}$ [56]


as the densities of the Hamiltonians of the flows of
the hierarchy.

Existence of such a hierarchy can be proved for
the case of GW invariants (and their descendents)
of complex projective spaces $P^d$ (the results of
Givental (2001) along with Dubrovin and Zhang
(2005) can be used). For d = 0 one obtains,
according to the celebrated result by Kontsevich
conjectured by Witten (see Topological Gravity,
Two-Dimensional), the tau function of the solution
to the KdV hierarchy (see Korteweg-de Vries Equation
and Other Modulation Equations) specified by the
initial condition,


$u(x)|_{t=0} = x$


For d = 1 the hierarchy in question is the extended
Toda lattice (see details in Dubrovin and Zhang
(2004); see also Toda Lattices). For all other $d \ge 2$,
the needed integrable hierarchy is a new one. It can
be associated (Dubrovin and Zhang) with an arbitrary
n-dimensional semisimple Frobenius manifold
M. The equations of the hierarchy have the form


$w^i_t = A^i_j(w)\, w^j_x + \varepsilon^2 \left[ B^i_j(w)\, w^j_{xxx} + C^i_{jk}(w)\, w^j_x w^k_{xx} + D^i_{jkl}(w)\, w^j_x w^k_x w^l_x \right] + O(\varepsilon^4)$ [57]

The coefficients of $\varepsilon^{2g}$ are graded homogeneous
polynomials in $u_x$, $u_{xx}$, etc., of the degree 2g + 1,

$\deg\, d^m u / dx^m = m$

The construction of the hierarchy is done in two
steps. First, we construct the leading approximation
(Dubrovin 1992). The equation of the hierarchy
specifying the dependence on $t = t^{\alpha,p}$ at $\varepsilon = 0$ reads


$\frac{\partial v}{\partial t^{\alpha,p}} = \partial_x \left( \nabla \theta_{\alpha,p+1}(v) \right), \quad \alpha = 1, \dots, n, \quad p \ge 0$ [58]


The functions $\theta_{\alpha,p}(v)$, $v \in M$, are the coefficients of
expansion [10] of the deformed flat functions
normalized by $\theta_{\alpha,0} = v_\alpha$. The solution $v = v(x, t)$ of
interest is determined from the implicit function
equations


$v = x e + \sum_{\alpha,p} t^{\alpha,p}\, \nabla \theta_{\alpha,p}(v)$ [59]


Next, one has to find a solution

$\Delta F = \sum_{g \ge 1} \varepsilon^{2g-2} F_g(v; v_x, \dots, v^{(3g-2)})$ [60]

of the following universal loop equation (closely
related with the Virasoro conjecture of Eguchi and
Xiong (1998)):


as &( 1 ) 
m Qv" *NE(v) — A 


í r 
=A 2. E "Op, GSH pa 


r>1 k=1 k 
A Ny teju- ay] 
"1 


16 
A O AT i's OAF 
TY Qv: SuvkByel * Qv ^ Oy»! 


x ls 


OAF y 
+3 Xu 
OPalv; A) OPADA) P aed 


Here U = U(v) is the operator of multiplication by
E(v), $p_\alpha = p_\alpha(v;\lambda)$, $\alpha = 1, \dots, n$, is a system of flat
coordinates [16] of the bilinear form [13]. The
substitution


$v_\alpha \mapsto w_\alpha = v_\alpha + \varepsilon^2\, \partial_x \partial_{t^{\alpha,0}} \Delta F(v; v_x, v_{xx}, \dots; \varepsilon^2)$ [62]

transforms [58] to [57]. The terms of the expansion
[60] are not polynomial in the derivatives. For
example (Dubrovin and Zhang 1998a),


$F_1 = \frac{1}{24} \sum_{i=1}^n \log u_i' + \log \frac{\tau_I(u)}{J^{1/24}(u)}, \qquad J(u) = \det\left( \frac{\partial v^\alpha}{\partial u_i} \right) = \pm \prod_{i=1}^n \psi_{i1}(u)$ [63]


(the canonical coordinates have been used) where
$\tau_I(u)$ is the isomonodromic tau function [29]. The
transformation [62] applied to the solution [59]
expresses higher-genus GW invariants of a variety X
with semisimple quantum cohomology $QH^*(X)$ via
the genus-zero invariants. For the particular case of
$X = P^2$, the formula [63] yields (Dubrovin and
Zhang 1998a)


$\frac{\phi''' - 27}{8 (27 + 2\phi' - 3\phi'')} = -\frac{1}{8} + \sum_{k \ge 1} k\, N_k^{(1)} \frac{e^{kz}}{(3k)!}$

Here


$\phi(z) = \sum_{k \ge 1} N_k \frac{e^{kz}}{(3k-1)!}$


is the generating function of the genus-zero GW
invariants of $P^2$ (see [54]) and $N_k^{(1)}$ = the number of
elliptic plane curves of the degree k passing through
3k generic points.


See also: Bi-Hamiltonian Methods in Soliton Theory; 
Functional Equations and Integrable Systems; Integrable 
Systems: Overview; Isomonodromic Deformations; 
Korteweg-de Vries Equation and Other Modulation 
Equations; Mirror Symmetry: A Geometric Survey; Moduli 
Spaces: An Introduction; Painlevé Equations; 
Riemann-Hilbert Problem; Toda Lattices; Topological 
Gravity, Two-Dimensional; Topological Quantum Field 
Theory: Overview; Topological Sigma Models. 
Introduction


Practically any physical, chemical, or biological
system can exhibit rhythmic oscillatory activity, at
least when the conditions are right. Winfree (2001)
reviews the ubiquity of oscillations in nature,
ranging from autocatalytic chemical reactions to
pacemaker cells in the heart, to animal gaits, and to
circadian rhythms. When coupled, even weakly,
oscillators interact via adjustment of their phases,
that is, their timing, often leading to synchronization.
In this chapter, we review the most important
concepts needed to study and understand the
dynamics of coupled oscillators.

From a mathematical point of view, an oscillator
is a dynamical system,


$\dot x = f(x), \quad x \in \mathbb{R}^m$ [1]


having a limit-cycle attractor, that is, a periodic orbit $\gamma \subset \mathbb{R}^m$.
Its period is the minimal T > 0 such that


$\gamma(t) = \gamma(t + T)$ for any t


and its frequency is $\Omega = 2\pi/T$. Let $x(0) = x_0 \in \gamma$ be
an arbitrary point on the attractor; then the state of
the system, x(t), is uniquely defined by its phase
$\vartheta \in S^1$ relative to $x_0$, where $S^1$ is the unit circle.

Throughout this article, we assume that the
periodic orbit $\gamma$ is exponentially stable, which
implies normal hyperbolicity. In this case, there is a
continuous transformation $\Theta: U \to S^1$ defined in a
neighborhood $U \supset \gamma$ such that $\vartheta(t) = \Theta(x(t))$ for any
trajectory in U, that is, $\Theta$ maps solutions of [1] to
solutions of


$\dot\vartheta = \Omega$ [2]


Such a transformation removes the amplitude but
saves the phase of oscillation.

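Accordingly, there is a continuous transformation
that maps solutions of the weakly coupled network
of n oscillators,


$\dot x_i = f_i(x_i) + \varepsilon g_i(x_1, \dots, x_n, \varepsilon), \quad \varepsilon \ll 1$ [3]

onto solutions of the phase system

$\dot\vartheta_i = \Omega_i + \varepsilon h_i(\vartheta_1, \dots, \vartheta_n, \varepsilon), \quad \vartheta_i \in S^1$ [4]


which is easier for studying the collective properties
of [3].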


Figure 1 A 2-torus and its representation on the square; panels (a), (b), (c).
(Modified from Hoppensteadt and Izhikevich 1997.)


Figure 2 Various degrees of locking of oscillators: frequency locking,
entrainment (1:1 frequency locking), and synchronization. (Modified
from Izhikevich 2006.)


The oscillators are said to be frequency locked when
[4] has a stable periodic orbit $\vartheta(t) = (\vartheta_1(t), \dots, \vartheta_n(t))$
on the n-torus $T^n$, as in Figure 1a. The “rotation
vector” or “winding ratio” of the orbit is the set of
integers $q_1 : q_2 : \cdots : q_n$ such that $\vartheta_1$ makes $q_1$ rotations
while $\vartheta_2$ makes $q_2$ rotations, etc., as in the 2:3
frequency locking in Figure 1a. The oscillators
are entrained when they are 1:1:...:1 frequency
locked. The oscillators are phase locked when there is
an (n - 1) x n integer matrix K having linearly
independent rows such that $K \vartheta(t)$ = const. For example,
the two oscillators in Figure 1b are phase locked
with K = (2, 3), while those in Figure 1c are not. The
oscillators are synchronized when they are entrained
and phase locked. Synchronization is in-phase when
$\vartheta_1(t) = \cdots = \vartheta_n(t)$ and out-of-phase otherwise. Two
oscillators are said to be synchronized antiphase when
$\vartheta_1(t) - \vartheta_2(t) = \pi$. Frequency locking without phase
locking, as in Figure 1c, is called phase trapping. The
relationship between all these definitions is depicted
in Figure 2.


Phase Resetting 


An exponentially stable periodic orbit is a normally
hyperbolic invariant manifold, hence its sufficiently
small neighborhood, U, is invariantly foliated by
stable submanifolds (Guckenheimer 1975) illustrated
in Figure 3. The manifolds represent points having
equal phases and, for this reason, they are called
isochrons (from Greek “iso” meaning equal and
“chronos” meaning time).


Figure 3 Isochrons of Andronov-Hopf oscillator ($\dot z = (1 + i)z - z|z|^2$, $z \in \mathbb{C}$) and van der Pol oscillator ($\dot x = x - x^3 - y$, $\dot y = x$).

The geometry of isochrons determines how the
oscillators react to perturbations. For example, the
pulse in Figure 3, right, moves the trajectory from
one isochron to another, thereby changing its phase.
The magnitude of the phase shift depends on the
amplitude and the exact timing of the stimulus
relative to the phase of oscillation $\vartheta$. Stimulating the
oscillator at different phases, one can measure the
phase transition curve (Winfree 2001)


$\vartheta_{new} = \mathrm{PTC}(\vartheta_{old})$

and the phase resetting curve


$\mathrm{PRC}(\vartheta) = \mathrm{PTC}(\vartheta) - \vartheta$
(shift = new phase - old phase)


Positive (negative) values of the PRC correspond to
phase advances (delays). PRCs are convenient when
the phase shifts are small, so that they can be
magnified and clearly seen, as in Figure 4. PTCs are
convenient when the phase shifts are large and
comparable with the period of oscillation.


Figure 4 Examples of phase response curves (PRCs) of the
oscillators in Figure 3 (Andronov-Hopf oscillator and van der Pol
oscillator; horizontal axes: stimulus phase $\vartheta$). $\mathrm{PRC}_1(\vartheta)$ and
$\mathrm{PRC}_2(\vartheta)$ correspond to
horizontal (along the first variable) and vertical (along the second
variable) pulses with amplitudes 0.2. An example of oscillation is
plotted as a dotted curve in each subplot (not to scale).


In Figure 5 we depict phase portraits of the 
Andronov-Hopf oscillator receiving pulses of 
magnitude 0.5 (left) and 1.5 (right). Notice the 
drastic difference between the corresponding PRCs 
or PTCs. Winfree (2001) distinguishes two cases: 


1. type 1 (weak) resetting results in continuous PRCs 
and PTCs with mean slope 1, and 

2. type 0 (strong) resetting results in discontinuous 
PRCs and PTCs with mean slope 0. 
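
The two resetting types can be checked numerically on the Andronov-Hopf oscillator of Figure 3. The sketch below is a minimal illustration under stated assumptions: the limit cycle is the unit circle, the isochrons are radial rays (which follows from the polar form of that oscillator, where the angular velocity does not depend on the radius), and the pulse is an instantaneous horizontal shift. The function names are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi


def ptc(theta, amplitude):
    """Phase transition curve for the pulse x -> x + amplitude.

    The state starts on the unit limit cycle at phase theta. With radial
    isochrons (an assumption stated above), the new phase is the polar
    angle of the shifted point.
    """
    x = math.cos(theta) + amplitude
    y = math.sin(theta)
    return math.atan2(y, x) % TWO_PI


def prc(theta, amplitude):
    """Phase resetting curve PTC(theta) - theta, wrapped to [-pi, pi]."""
    shift = ptc(theta, amplitude) - theta
    return math.atan2(math.sin(shift), math.cos(shift))


def ptc_winding_number(amplitude, samples=2000):
    """Net number of turns made by PTC(theta) while theta makes one turn."""
    total = 0.0
    previous = ptc(0.0, amplitude)
    for k in range(1, samples + 1):
        current = ptc(TWO_PI * k / samples, amplitude)
        step = current - previous
        # Unwrap each small increment so that jumps by 2*pi are not counted.
        total += math.atan2(math.sin(step), math.cos(step))
        previous = current
    return round(total / TWO_PI)


if __name__ == "__main__":
    for amplitude in (0.5, 1.5):
        print(amplitude, ptc_winding_number(amplitude))
```

The winding number plays the role of the mean slope of the PTC: a pulse smaller than the radius of the cycle leaves the shifted circle around the equilibrium, while a larger pulse moves it past the equilibrium.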


Figure 5 Types of phase resetting of the Andronov-Hopf
oscillator in Figure 3: type 1 (weak) resetting and type 0 (strong)
resetting. Panels show the phase resetting curve $\mathrm{PRC}(\vartheta)$ and the
phase transition curve $\mathrm{PTC}(\vartheta) = (\vartheta + \mathrm{PRC}(\vartheta)) \bmod 2\pi$
against the stimulus phase $\vartheta$.

The discontinuity of type 0 PRC in Figure 5 is a
topological property that cannot be removed by
relocating the initial point $x_0$ that corresponds to
zero phase. The discontinuity stems from the fact
that the shifted image of the limit cycle (dashed
circle) goes beyond the central equilibrium at which
the phase is not defined.

The stroboscopic mapping of $S^1$ to itself, called
Poincaré phase map,


$\vartheta_{k+1} = \mathrm{PTC}(\vartheta_k)$ [5]


describes the response of an oscillator to a T-periodic
pulse train. Here, $\vartheta_k$ denotes the phase of oscillation
when the kth input pulse arrives. Its fixed points
correspond to synchronized solutions, and its periodic
orbits correspond to phase-locked states.
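
A minimal sketch of iterating the Poincaré phase map, assuming the same idealized Andronov-Hopf oscillator (unit limit cycle, radial isochrons), a horizontal pulse of amplitude 0.5, and a pulse period equal to the natural period so that the free rotation between pulses contributes a full turn. All names are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi


def ptc(theta, amplitude=0.5):
    """Assumed phase transition curve: polar angle of the shifted point."""
    return math.atan2(math.sin(theta), math.cos(theta) + amplitude) % TWO_PI


def iterate_phase_map(theta0, pulses, amplitude=0.5):
    """Return the phases seen at successive pulses, starting from theta0."""
    phases = [theta0 % TWO_PI]
    for _ in range(pulses):
        phases.append(ptc(phases[-1], amplitude))
    return phases


def circular_distance(a, b):
    """Shortest distance between two phases on the circle."""
    d = (a - b) % TWO_PI
    return min(d, TWO_PI - d)


if __name__ == "__main__":
    orbit = iterate_phase_map(2.0, 60)
    print(circular_distance(orbit[-1], 0.0))
```

Under these assumptions the map has a stable fixed point at zero phase and an unstable one at phase pi, so generic initial phases converge to the synchronized solution.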


Weak Coupling 


Now consider dynamical systems of the form


$\dot x = f(x) + \varepsilon s(t)$ [6]


describing periodic oscillators, $\dot x = f(x)$, forced by
a weak time-dependent input $\varepsilon s(t)$, for example, from
other oscillators in a network. Let $\Theta(x)$ denote the
phase of oscillation at point $x \in U$, so that the map
$\Theta: U \to S^1$ is constant along each isochron. This
mapping transforms [6] into the phase model


$\dot\vartheta = \Omega + \varepsilon Q(\vartheta) \cdot s(t)$


with function $Q(\vartheta)$, illustrated in Figure 6, satisfying
three equivalent conditions:


1. Winfree: $Q(\vartheta)$ is normalized PRC to infinitesimal
pulsed perturbations;

2. Kuramoto: $Q(\vartheta) = \mathrm{grad}\, \Theta(x)$; and

3. Malkin: Q is the solution to the adjoint problem
$\dot Q = -\{Df(\gamma(t))\}^{\top} Q$ [7]
with the normalization $Q(t) \cdot f(\gamma(t)) = \Omega$ for any t.


Figure 6 Solutions $Q = (Q_1, Q_2)$ to the adjoint problem [7] for
oscillators in Figure 3 (Andronov-Hopf oscillator and van der Pol
oscillator; horizontal axes: phase $\vartheta$).
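
The adjoint characterization can be verified numerically on the Andronov-Hopf oscillator of Figure 3, written in real coordinates. The sketch below is an illustration under assumptions: the limit cycle is the unit circle traversed with unit frequency, the candidate Q is the gradient of the polar angle on that circle, and the normalization constant is taken to be the frequency (here 1). The function names are hypothetical.

```python
import math


def vector_field(x, y):
    """Real form of z' = (1 + i) z - z |z|^2 with z = x + i y."""
    r2 = x * x + y * y
    return (x - y - x * r2, x + y - y * r2)


def jacobian(x, y):
    """Jacobian matrix Df of the vector field above."""
    return (
        (1 - 3 * x * x - y * y, -1 - 2 * x * y),
        (1 - 2 * x * y, 1 - x * x - 3 * y * y),
    )


def q_candidate(t):
    """Assumed solution: gradient of the polar angle on the unit circle."""
    return (-math.sin(t), math.cos(t))


def adjoint_residual(t, h=1e-5):
    """Largest component of dQ/dt + Df(gamma(t))^T Q at time t."""
    x, y = math.cos(t), math.sin(t)
    (a, b), (c, d) = jacobian(x, y)
    q1, q2 = q_candidate(t)
    # Right-hand side of the adjoint problem: -Df^T Q.
    rhs = (-(a * q1 + c * q2), -(b * q1 + d * q2))
    # Centered finite difference approximating dQ/dt.
    qp, qm = q_candidate(t + h), q_candidate(t - h)
    dq = ((qp[0] - qm[0]) / (2 * h), (qp[1] - qm[1]) / (2 * h))
    return max(abs(dq[0] - rhs[0]), abs(dq[1] - rhs[1]))


def normalization(t):
    """Value of Q(t) . f(gamma(t)) on the limit cycle."""
    f1, f2 = vector_field(math.cos(t), math.sin(t))
    q1, q2 = q_candidate(t)
    return q1 * f1 + q2 * f2


if __name__ == "__main__":
    print(adjoint_residual(0.7), normalization(0.7))
```

For oscillators without a closed-form Q, the same residual can serve as a check on a numerically integrated solution of the adjoint problem.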


The function $Q(\vartheta)$ can be found analytically in a
few simple cases:


1. a nonlinear phase oscillator $\dot x = f(x)$ with $x \in S^1$
and f > 0 has $Q(\vartheta) = \Omega / f(\gamma(\vartheta))$;

2. a system near saddle-node on invariant circle
bifurcation has $Q(\vartheta)$ proportional to $1 - \cos\vartheta$;
and

3. a system near supercritical Andronov-Hopf
bifurcation has $Q(\vartheta)$ proportional to $\sin(\vartheta - \psi)$,
where $\psi \in S^1$ is a constant phase shift.


Other interesting cases, including homoclinic, 
relaxation, and bursting oscillators are considered 
by Izhikevich (2006). 

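Treating s(t) in [6] as the input from the network,
we can transform weakly coupled oscillators


$\dot x_i = f_i(x_i) + \varepsilon \underbrace{\sum_{j=1}^n g_{ij}(x_i, x_j)}_{s_i(t)}, \quad x_i \in \mathbb{R}^m$ [8]


to the phase model


$\dot\vartheta_i = \Omega_i + \varepsilon\, Q_i(\vartheta_i) \cdot \underbrace{\sum_{j=1}^n g_{ij}(x_i(\vartheta_i), x_j(\vartheta_j))}_{s_i(t)}$ [9]


having the form [4] with $h_i = Q_i \cdot \sum_j g_{ij}$, or the form

$\dot\vartheta_i = \Omega_i + \varepsilon \sum_{j=1}^n h_{ij}(\vartheta_i, \vartheta_j)$


where $h_{ij} = Q_i \cdot g_{ij}$. Introducing phase deviation variables
$\vartheta_i = \Omega_i t + \varphi_i$, we transform this system into the
form


$\dot\varphi_i = \varepsilon \sum_{j=1}^n h_{ij}(\Omega_i t + \varphi_i, \Omega_j t + \varphi_j)$


which can be averaged to


$\dot\varphi_i = \varepsilon \omega_i + \varepsilon \sum_{j \ne i}^n H_{ij}(\varphi_i - \varphi_j)$ [10]


with the functions


$H_{ij}(\chi) = \lim_{T \to \infty} \frac{1}{T} \int_0^T h_{ij}(\Omega_i t, \Omega_j t - \chi)\, dt$ [11]


describing the interaction between oscillators
(Ermentrout and Kopell 1984). To summarize, we
transformed weakly coupled system [8] into the
phase model [10] with H given by [11] and each Q
being the solution to the adjoint problem [7]. This
constitutes the Malkin theorem for weakly coupled
oscillators (Hoppensteadt and Izhikevich 1997,
theorem 9.2).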
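
The averaging step can be carried out numerically. The sketch below is an illustration under assumptions: two identical Andronov-Hopf oscillators with unit frequency, limit cycle on the unit circle, Q equal to the gradient of the polar angle, and a gap-junction-like coupling acting on the first variable only. Because the integrand is periodic, the long-time average is replaced by an average over one period. The function names are hypothetical.

```python
import math


def h(theta_i, theta_j):
    """Assumed interaction term Q(theta_i) . g(x_i(theta_i), x_j(theta_j))."""
    q = (-math.sin(theta_i), math.cos(theta_i))
    # Coupling through the first variable: (x_j1 - x_i1, 0).
    g = (math.cos(theta_j) - math.cos(theta_i), 0.0)
    return q[0] * g[0] + q[1] * g[1]


def averaged_coupling(chi, samples=1000):
    """Average of h(t, t - chi) over one period (equal unit frequencies)."""
    dt = 2.0 * math.pi / samples
    return sum(h(k * dt, k * dt - chi) for k in range(samples)) / samples


if __name__ == "__main__":
    for chi in (0.0, 0.5, 1.0, 2.0):
        print(chi, averaged_coupling(chi), -0.5 * math.sin(chi))
```

For this particular choice the average can also be done by hand and gives a multiple of the sine of the phase difference, which the printed columns can be compared against.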

Existence of one equilibrium of the phase model
[10] implies the existence of the entire circular
family of equilibria, since translation of all $\varphi_i$ by a
constant phase shift does not change the phase
differences $\varphi_i - \varphi_j$ and hence the form of [10]. This
family corresponds to a limit cycle of [8], on which
all oscillators have equal frequencies and constant
phase shifts, that is, they are synchronized, possibly
out of phase.

We say that two oscillators, i and j, have resonant
(or commensurable) frequencies when the ratio
$\Omega_i / \Omega_j$ is a rational number, for example, it is p/q
for some integers p and q. They are nonresonant
when the ratio is an irrational number. In this case,
the function $H_{ij}$ defined above is constant regardless
of the details of the oscillatory dynamics or the
details of the coupling, that is, dynamics of two
coupled nonresonant oscillators is described by an
uncoupled phase model. Apparently, such oscillators
do not interact; that is, the phase of one of them
cannot change the phase of the other one even on
the long timescale of order $1/\varepsilon$.


Synchronization 


Consider [8] with n = 2, describing two mutually
coupled oscillators. Let us introduce “slow” time
$\tau = \varepsilon t$ and rewrite the corresponding phase model
[10] in the form


$\varphi_1' = \omega_1 + H_{12}(\varphi_1 - \varphi_2)$

$\varphi_2' = \omega_2 + H_{21}(\varphi_2 - \varphi_1)$

where $' = d/d\tau$ and $\omega_i = H_{ii}(0)$ is the frequency
deviation from the natural oscillation, i = 1, 2. Let
$\chi = \varphi_2 - \varphi_1$ denote the phase difference between the
oscillators; then


$\chi' = \omega + H(\chi)$ [12]

where

$\omega = \omega_2 - \omega_1$ and $H(\chi) = H_{21}(\chi) - H_{12}(-\chi)$


is the frequency mismatch and the antisymmetric
part of the coupling, respectively, illustrated in
Figure 7, dashed curves. A stable equilibrium of
[12] corresponds to a stable limit cycle of the phase
model.
Figure 7 Solid curves: functions $H_{ij}(\chi)$ defined by [11]
corresponding to the gap-junction input $g(x_i, x_j) = (x_{j1} - x_{i1}, 0)$.
Dashed curves: functions $H(\chi) = H_{ij}(\chi) - H_{ij}(-\chi)$. Parameters
are as in Figure 3. (Panels: van der Pol oscillator and Andronov-Hopf
oscillator; horizontal axes: phase difference $\chi$.)


All equilibria of [12] are solutions to $H(\chi) = -\omega$,
and they are intersections of the horizontal line $-\omega$
with the graph of H. They are stable if the slope
of the graph is negative at the intersection. If
oscillators are identical, then $H(\chi)$ is an odd
function (i.e., $H(-\chi) = -H(\chi)$), and $\chi = 0$ and
$\chi = \pi$ are always equilibria, possibly unstable,
corresponding to the in-phase and antiphase
synchronized solutions. The in-phase synchronization
of gap-junction coupled oscillators in Figure 7 is
stable because the slope of H (dashed curves) is
negative at $\chi = 0$. The max and min values of the
function H determine the tolerance of the network
to the frequency mismatch $\omega$, since there are no
equilibria outside this range.
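
The graphical criterion can be turned into a small root-finding routine. The sketch below is a minimal illustration assuming the odd coupling function H(chi) = -sin(chi), chosen only as an example; the grid scan with bisection and all names are hypothetical choices.

```python
import math


def coupling(chi):
    """Assumed coupling function H(chi) = -sin(chi), for illustration only."""
    return -math.sin(chi)


def equilibria(omega, coupling_fn=coupling, grid=720):
    """Return (chi, is_stable) pairs for chi' = omega + H(chi) on [0, 2*pi)."""

    def rhs(chi):
        return omega + coupling_fn(chi)

    found = []
    step = 2.0 * math.pi / grid
    for k in range(grid):
        a, b = k * step, (k + 1) * step
        fa, fb = rhs(a), rhs(b)
        if fa == 0.0:
            root = a
        elif fa * fb < 0.0:
            # Bisection inside the bracketing interval.
            for _ in range(60):
                mid = 0.5 * (a + b)
                fm = rhs(mid)
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            root = 0.5 * (a + b)
        else:
            continue
        # Stable when the slope of the right-hand side is negative.
        slope = (rhs(root + 1e-6) - rhs(root - 1e-6)) / 2e-6
        found.append((root, slope < 0.0))
    return found


if __name__ == "__main__":
    print(equilibria(0.3))
    print(equilibria(1.5))
```

With this coupling a mismatch of 0.3 gives one stable and one unstable phase difference, while a mismatch larger than the maximum of |H| leaves no equilibrium, in line with the tolerance argument above.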

Now consider a network of n > 2 weakly coupled
oscillators [8]. To determine the existence and
stability of synchronized states in the network, we
need to study equilibria of the corresponding phase
model [10]. The vector $\phi = (\phi_1, \dots, \phi_n)$ is an
equilibrium of [10] when


$0 = \omega_i + \sum_{j \ne i} H_{ij}(\phi_i - \phi_j)$ (for all i) [13]

It is stable when all eigenvalues of the linearization
matrix (Jacobian) at $\phi$ have negative real parts,
except one zero eigenvalue corresponding to the
eigenvector along the circular family of equilibria ($\phi$
plus a phase shift is a solution of [13] too since the
phase shifts $\phi_i - \phi_j$ are not affected).
In general, determining the stability of equilibria
is a difficult problem. Ermentrout (1992) found a
simple sufficient condition. If


1. $a_{ij} = H_{ij}'(\phi_i - \phi_j) \le 0$, and

2. the directed graph defined by the matrix $a = (a_{ij})$
is connected (i.e., each oscillator is influenced,
possibly indirectly, by every other oscillator),


then the equilibrium $\phi$ is neutrally stable, and the
corresponding limit cycle $x(t + \phi)$ of [8] is asymptotically
stable.
Another sufficient condition was found by
Hoppensteadt and Izhikevich (1997). If system [10]
satisfies


1. $\omega_1 = \cdots = \omega_n = \omega$ (identical frequencies)
2. $H_{ij}(-\chi) = -H_{ji}(\chi)$ (pairwise odd coupling)


for all i and j, then the network dynamics converge to a
limit cycle. On the cycle, all oscillators have equal
frequencies $1 + \varepsilon\omega$ and constant phase deviations.

The proof follows from the observation that [10]
is a gradient system in the rotating coordinates
$\varphi = \omega\tau + \phi$ with the energy function


$E(\phi) = -\frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n R_{ij}(\phi_i - \phi_j)$, where $R_{ij}(\chi) = \int_0^\chi H_{ij}(s)\, ds$


One can check that $dE(\phi)/d\tau = -\sum_i (\phi_i')^2 \le 0$
along the trajectories of [10] with equality only at
equilibria.
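
The monotone decrease of the energy can be observed in a short simulation. The sketch below is an illustration under assumptions: identical frequencies (so the dynamics is written in the rotating frame), the same odd coupling H(chi) = -sin(chi) for every pair, an energy of the form E = -(1/2) sum over pairs of R(phi_i - phi_j) with R the integral of H from 0, and explicit Euler steps with a small step size. The names are hypothetical.

```python
import math


def coupling(chi):
    """Assumed pairwise coupling H(chi) = -sin(chi)."""
    return -math.sin(chi)


def coupling_integral(chi):
    """R(chi) = integral of H from 0 to chi, here cos(chi) - 1."""
    return math.cos(chi) - 1.0


def energy(phases):
    """Assumed energy E = -(1/2) * sum_{i,j} R(phi_i - phi_j)."""
    return -0.5 * sum(coupling_integral(a - b) for a in phases for b in phases)


def velocity(phases):
    """Right-hand side in the rotating frame: sum_{j != i} H(phi_i - phi_j)."""
    return [
        sum(coupling(a - b) for j, b in enumerate(phases) if j != i)
        for i, a in enumerate(phases)
    ]


def energy_trace(phases, dt=0.01, steps=500):
    """Energies along an explicit Euler trajectory started at `phases`."""
    phases = list(phases)
    trace = [energy(phases)]
    for _ in range(steps):
        v = velocity(phases)
        phases = [p + dt * vi for p, vi in zip(phases, v)]
        trace.append(energy(phases))
    return trace


if __name__ == "__main__":
    trace = energy_trace([0.0, 1.0, 2.5, 4.0, 5.5])
    print(trace[0], trace[-1])
```

The step size is kept well below the stability limit of the Euler scheme for five oscillators, so the discrete trajectory inherits the descent property of the continuous flow.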


Mean-Field Approximations 


Let us represent the phase model [10] in the form


$\varphi_i' = \omega_i + \sum_{j \ne i}^n H_{ij}(\varphi_i - \varphi_j)$

where $' = d/d\tau$, $\tau = \varepsilon t$ is the slow time, and
$\omega_i = H_{ii}(0)$ are random frequency deviations. Collective
dynamics of this system can be analyzed
in the limit $n \to \infty$. We illustrate the theory
using the special case, $H(\chi) = -\sin\chi$, known as the
Kuramoto (1984) model:


$\varphi_i' = \omega_i + \frac{K}{n} \sum_{j=1}^n \sin(\varphi_j - \varphi_i), \quad \varphi_i \in [0, 2\pi]$ [14]


where K > 0 is the coupling strength and the factor
1/n ensures that the model behaves well as $n \to \infty$.
The complex-valued sum of all phases,


$r e^{i\psi} = \frac{1}{n} \sum_{j=1}^n e^{i\varphi_j}$ (Kuramoto synchronization index) [15]


describes the degree of synchronization in the
network. Apparently, the in-phase synchronized
state $\varphi_1 = \cdots = \varphi_n$ corresponds to r = 1 with $\psi$
being the population phase. In contrast, the incoherent
state with all $\varphi_i$ having different values
randomly distributed on the unit circle, corresponds
to r = 0. Intermediate values of r correspond to a
partially synchronized or coherent state, depicted in
Figure 8. Some phases are synchronized forming a
cluster, while others roam around the circle.


Figure 8 Kuramoto synchronization index [15] describes the
degree of coherence in the network [14].

Multiplying both sides of [15] by $e^{-i\varphi_i}$ and
considering only the imaginary parts, we can rewrite
[14] in the equivalent form


$\varphi_i' = \omega_i + K r \sin(\psi - \varphi_i)$


that emphasizes the mean-field character of interactions
between the oscillators: they all are pulled into
the synchronized cluster ($\varphi_i \to \psi$) with the effective
strength proportional to the cluster size r. This pull
is offset by the random frequency deviations $\omega_i$ that
pull away from the cluster.
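
The equivalence of the pairwise and mean-field forms, and the pull toward the cluster, can be checked in a few lines. The sketch below is a minimal illustration: the explicit Euler integrator, the step size, and the function names are assumptions made for the example, not part of the model.

```python
import cmath
import math


def order_parameter(phases):
    """Return (r, psi) with r * exp(i * psi) equal to the mean of exp(i * phase)."""
    z = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(z), cmath.phase(z)


def pairwise_rhs(phases, freqs, coupling_strength):
    """Right-hand side written with the sum over all pairs."""
    n = len(phases)
    return [
        w + (coupling_strength / n) * sum(math.sin(p_j - p_i) for p_j in phases)
        for p_i, w in zip(phases, freqs)
    ]


def mean_field_rhs(phases, freqs, coupling_strength):
    """Right-hand side written with the order parameter (r, psi)."""
    r, psi = order_parameter(phases)
    return [
        w + coupling_strength * r * math.sin(psi - p)
        for p, w in zip(phases, freqs)
    ]


def simulate(phases, freqs, coupling_strength, dt=0.05, steps=2000):
    """Explicit Euler integration of the mean-field form (a simple sketch)."""
    phases = list(phases)
    for _ in range(steps):
        rhs = mean_field_rhs(phases, freqs, coupling_strength)
        phases = [p + dt * v for p, v in zip(phases, rhs)]
    return phases


if __name__ == "__main__":
    start = [2.0 * k / 20 for k in range(20)]
    final = simulate(start, [0.0] * 20, 1.0)
    print(order_parameter(start)[0], order_parameter(final)[0])
```

With identical frequencies and initial phases confined to less than half of the circle, the cluster size grows toward 1; adding a spread of frequencies to `freqs` lets one observe partial synchronization instead.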

Let us assume that the $\omega_i$ are distributed randomly
around 0 with a symmetrical probability density
function $g(\omega)$, for example, Gaussian. Kuramoto has
shown that in the limit $n \to \infty$, the cluster size $r$
obeys the self-consistency equation

$$r = rK\int_{-\pi/2}^{+\pi/2} g(Kr\sin\varphi)\,\cos^2\varphi\,d\varphi \qquad [16]$$

Notice that $r = 0$, corresponding to the incoherent
state, is always a solution of this equation. When the
coupling strength $K$ is greater than a certain critical
value,

$$K_c = \frac{2}{\pi g(0)}$$

an additional, nontrivial solution $r > 0$ appears,
which corresponds to a partially synchronized
state. Expanding $g$ in a Taylor series, one gets the
scaling $r = \sqrt{16(K - K_c)/(-g''(0)\pi K_c^4)}$. Thus, the
stronger the coupling $K$ relative to the random
distribution of frequencies, the more oscillators synchronize
into a coherent cluster. The issue of stability
of incoherent and partially synchronized states is
discussed by Strogatz (2000). Other generalizations
of the Kuramoto model are reviewed by Acebron et al.
(2005). An extended version of this article with the
emphasis on computational neuroscience can be found
in the recent book by Izhikevich (2006).
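The self-consistency equation [16] and the critical coupling can be explored numerically. The sketch below is an illustration under stated assumptions, not the authors' method: it takes a unit-variance Gaussian density, evaluates the integral with a midpoint rule, and finds the nontrivial root by bisection. The names `gain` and `cluster_size` are hypothetical.

```python
import math

def gaussian(w, sigma=1.0):
    """Symmetric frequency density g(w); a unit-variance Gaussian is assumed."""
    return math.exp(-0.5 * (w / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def critical_coupling(g0):
    """K_c = 2 / (pi * g(0))."""
    return 2.0 / (math.pi * g0)

def gain(r, K, g=gaussian, steps=2000):
    """K * integral over [-pi/2, pi/2] of g(K r sin(phi)) cos^2(phi), midpoint rule."""
    d = math.pi / steps
    total = 0.0
    for k in range(steps):
        phi = -0.5 * math.pi + (k + 0.5) * d
        total += g(K * r * math.sin(phi)) * math.cos(phi) ** 2
    return K * total * d

def cluster_size(K, g=gaussian, tol=1e-10):
    """Nontrivial root of gain(r, K) = 1 by bisection; 0.0 if only r = 0 solves [16]."""
    if gain(0.0, K, g) <= 1.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gain(mid, K, g) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

K_c = critical_coupling(gaussian(0.0))
r_weak = cluster_size(1.0)      # below K_c: incoherent state only
r_strong = cluster_size(2.0)    # above K_c: partially synchronized state

# Compare with the square-root scaling just above K_c (g''(0) = -g(0) for sigma = 1).
K_near = 1.01 * K_c
r_near = cluster_size(K_near)
r_scaling = math.sqrt(16.0 * (K_near - K_c) / (gaussian(0.0) * math.pi * K_c ** 4))
```

Dividing [16] by $r$ reduces the search to solving `gain(r, K) = 1`, which is why the trivial solution is handled separately.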


See also: Bifurcations of Periodic Orbits; Dynamical 
Systems and Thermodynamics; Hamiltonian Systems: 
Stability and Instability Theory; Singularity and Bifurcation 
Theory; Stability Theory and KAM; Synchronization of 
Chaos.

Introduction 


It is recognized that one of the outstanding problems 
in modern physics is to formulate the quantum 
theory of gravity, synthesizing the principles of 
quantum mechanics and general theory of relativity. 
The fundamental units for measuring time, length,
and energy, known as Planck time, Planck length,
and Planck energy (the last quoted here as a mass, $m_P$),
respectively, are defined to be
$t_P = (\hbar G/c^5)^{1/2} = 5.39 \times 10^{-44}$ s, $l_P = (\hbar G/c^3)^{1/2} =
1.61 \times 10^{-33}$ cm, and $m_P = (\hbar c/G)^{1/2} = 2.17 \times
10^{-5}$ g, in terms of Newton's constant, $G$, the
velocity of light, $c$, and $\hbar = h/2\pi$, $h$ being
Planck's constant. We may conclude, on dimensional
arguments, that quantum gravity effects will
play an important role when we consider physical
phenomena in the vicinity of these scales. Therefore,
when we probe very short distances, consider
collisions at Planckian energies, and envisage the evolution
of the universe in the Planck era, quantum
gravity will come into play in a predominant
manner. The purpose of this article is to present an
overview of an approach to quantize Einstein's
theory of gravity, pioneered by Wheeler and De
Witt almost four decades ago. We proceed to
recapitulate various prescriptions for quantizing
gravitation and then discuss a simple derivation of
the Wheeler-De Witt (WDW) equation in general
relativity and some of its applications in the study of
quantum cosmology. There are, broadly speaking, 
three different approaches to quantize gravity. 
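The Planck scales quoted above follow directly from dimensional combinations of $\hbar$, $G$, and $c$. As a quick check, the sketch below evaluates them in CGS units; the numerical constants are standard reference values inserted here as assumptions, and the function name is illustrative.

```python
import math

# Reference values in CGS units (assumed here for illustration).
HBAR = 1.054571817e-27    # erg s
G_NEWTON = 6.67430e-8     # cm^3 g^-1 s^-2
C_LIGHT = 2.99792458e10   # cm / s

def planck_units(hbar=HBAR, G=G_NEWTON, c=C_LIGHT):
    """Return (t_P, l_P, m_P) in seconds, centimetres, and grams."""
    t_p = math.sqrt(hbar * G / c ** 5)
    l_p = math.sqrt(hbar * G / c ** 3)
    m_p = math.sqrt(hbar * c / G)
    return t_p, l_p, m_p

t_planck, l_planck, m_planck = planck_units()
```

By construction `l_planck` equals `C_LIGHT * t_planck`, which is a useful sanity check on the exponents.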

The general theory of relativity has been tested to
a great degree of accuracy in the classical regime. The
geometrical description of spacetime plays a cardinal
role in Einstein's theory. Therefore, general
relativists emphasize the geometrical attributes of the
theory and the central role played by the spacetime
structure in their formulation of quantum theory.
It is natural to adopt a background-independent
approach. In contrast, along the path followed by
quantum field theorists, where the prescription is
valid in the weak-field approximation, the theory is
quantized in a given background, usually Minkowski
space. It is argued by the proponents of the
geometric approach, that the background metric 
should emerge from the theory in a self-consistent 
manner rather than being introduced by hand when 
we quantize the theory. One of the earliest attempts
to quantize gravity followed the route of the
canonical method. The canonical quantization
approach has many advantages. One of the important
features is that it is quite similar to the
prescriptions adopted in quantum field theory, where
one uses the notions of operators, commutation relations,
etc. Moreover, the subtleties encountered in quantizing
gravity are transparent. Therefore, the canonical
procedure is preferred over the path-integral formulation,
although the latter has its own advantages too.
Another positive aspect of the canonical approach is
that the requirement of a background-independent
formulation could be maintained to some extent.
Thus, there is room for exploring some of the
nonperturbative attributes of the theory. The relativists
favor the canonical formulation, since some of the
geometrical features of the general theory of relativity
could be incorporated here and be explored to see
how far the quantum theory captures such properties
of the classical theory. As we shall discuss in the sequel,
some of the interesting issues of quantum cosmology
are addressed in this approach. However, there are
limitations and shortcomings in this formulation, and
we refer the reader to the textbooks and review
articles for further reading and critical assessments of
the canonical approach to quantizing gravity.

The second approach is primarily the endeavor of 
physicists who have devoted their research to 
quantum field theory. Feynman’s seminal work on 
quantization of gravity from this perspective has 
profoundly influenced the subsequent developments. 
The quantization of gravity is carried out in the 
weak-field approximation such that the graviton is 
identified as the fluctuation over the Minkowski 
background metric. It is a massless spin-2 field as one 
concludes from the properties of low-energy gravita- 
tional interaction in the classical limit. Furthermore, 
the gauge invariance associated with a spin-2 mass- 
less field gets intimately related with invariance of 
Einstein's theory under general coordinate transformations.
In this setup, field-theoretic techniques
can be employed to quantize the theory and to consider
perturbative expansions for the scattering amplitudes.
It is realized that low-energy amplitudes computed
from the massless spin-2 theory match those
derived from the Einstein-Hilbert action in the weak-field
approximation. However, the theory is not
perturbatively renormalizable since the coupling
constant carries dimension. One of the most important
outcomes of the investigations from this perspective
is the discovery, due to Feynman, that the
introduction of ghost fields is necessary in order to
maintain unitarity of the S-matrix when one goes
beyond the tree level. As is well known, this work has
profoundly influenced frontiers of research in physics,
leading to the quantization of Yang-Mills theory which,
in turn, paved the way for electroweak theory and
QCD. It is worthwhile to mention in passing that the
quantum phenomena associated with gravity in the 
nonperturbative regime cannot be addressed in this 
framework. 

In recent years, superstring theory has been at the 
center stage in order to provide a unified theory of 
fundamental interactions. It is postulated that all 
elementary constituents of matter and the carriers of 
the interactions such as gauge bosons and graviton 
are excitations of one-dimensional extended objects:
the strings. The superstring theories are perturbatively
consistent in the critical ten dimensions. The
closed-superstring spectrum contains a massless spin-2
state which is identified as the graviton. It is
well known that perturbative computations of processes
involving the graviton turn out to be finite.
Moreover, the Einstein-Hilbert term appears natu- 
rally when one derives the string effective action. 
Therefore, it is expected that string theory will be 
able to provide answers to questions related to 
quantum gravity. Indeed, the theory has met with 
success in resolving some important issues. We note
that cosmological scenarios have been discussed in the
string theory framework and the WDW equation
has played an important role in the study of quantum
string cosmology. We shall comment on this aspect
towards the end of this article.


The Canonical Structure of Einstein 
Gravity 


The Einstein-Hilbert action is

$$S = \frac{1}{16\pi G}\int_M d^4x\,\sqrt{-g}\,(R - 2\Lambda) \qquad [1]$$


where $R$ is the Ricci scalar derived from the metric,
$g_{\mu\nu}$, and $\Lambda$ is the cosmological constant. The field
equations are derived from the action by the
standard variational technique. Note that $R$ involves
second derivatives of the metric. If we have compact
manifolds with boundary $\partial M$ such that variations of
the metric vanish on the boundary and the normal
derivatives do not, it is necessary to add a surface
term to this action. The exact form of this term will
be discussed later. Einstein's theory of gravitation
is manifestly covariant. The associated action
[1] is invariant under general coordinate transformations
$x^\mu \to x'^\mu(x)$, under which the metric transforms as a
rank-2 covariant tensor.
Therefore, we expect that the theory will be
endowed with constraints expressed in terms of the 
canonical variables. One can implement general 
coordinate transformations so that there are only 
two pairs of canonical phase-space variables on a 
spacelike hypersurface. In other words, from physi- 
cal considerations, graviton has only two polariza- 
tions whereas the metric has ten components. 
Therefore, the two physical degrees of freedom can 
be obtained using the freedom of choosing the 
“gauge” transformations in this context. It is 
desirable to identify the constraints and analyze
their structure, most appropriately in Dirac's
formalism, and to quantize the theory canonically as
the next step. This is the path we intend to follow in
order to arrive at the WDW equation.


The Classical Constraints 


The Hamiltonian approach is most appropriate to 
employ the constraint formalism due to Dirac. We 
recall that the Lagrangian formulation is manifestly 
covariant as is reflected in the field equations; 
whereas the spacetime covariance is lost in the 
passage to the Hamiltonian approach. Furthermore, 
the spatial components of the metric are the 
dynamical degrees of freedom. We adopt the
formalism introduced by Arnowitt, Deser, and
Misner (ADM) for the so-called 3+1 split of the
hyperbolic Riemannian spacetime metric, $g_{\mu\nu}$. One
introduces the lapse function, $N^{\perp}$, and the shift
function, $N^i$. We suppress the factors of $1/16\pi G$,
etc., for the time being for the general discussions
and shall reintroduce them later. A family of
spacelike hypersurfaces, $\Sigma_t$, is constructed, with
metric $h_{ij}$ induced on each. Here $t$ is a timelike
parameter that parametrizes $\Sigma_t$. The distance between
points on two neighboring hypersurfaces, $\Sigma_t$ and
$\Sigma_{t+dt}$, with coordinates $(t, x^i)$ and $(t + dt, x^i + dx^i)$,
respectively, is given by


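$$ds^2 = -(N^{\perp})^2 dt^2 + h_{ij}(N^i dt + dx^i)(N^j dt + dx^j) = g_{\mu\nu}\,dx^\mu dx^\nu \qquad [3]$$

The indices of tensors defined on $\Sigma_t$ are raised and
lowered by $h_{ij}$ and its inverse $h^{ij}$. The relations
between the components of $g_{\mu\nu}$ and $N^{\perp}$, $N^i$, $h_{ij}$ can
be obtained easily,

$$g_{00} = h_{ij}N^iN^j - (N^{\perp})^2, \qquad g_{0i} = h_{ij}N^j \qquad [4]$$

The above relations can be inverted to give

$$N^i = h^{ij}g_{0j}, \qquad N^{\perp} = \frac{1}{\sqrt{-g^{00}}} \qquad [5]$$

The relations between spatial components, $g_{ij}$, of $g_{\mu\nu}$
and $h_{ij}$ and some other useful relations are listed
below for later convenience:

$$g_{ij} = h_{ij}, \qquad g^{ij} = h^{ij} - \frac{N^iN^j}{(N^{\perp})^2}, \qquad \sqrt{-g} = N^{\perp}\sqrt{h}, \qquad g^{0i} = \frac{N^i}{(N^{\perp})^2} \qquad [6]$$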

Note that $(N^{\perp}, N^i)$ are introduced to specify the
deformation of the hypersurface and therefore, the
evolution equations through the Hamiltonian will
not determine them; they are arbitrary functions.
Consequently, [4] implies that $g_{00}$ and $g_{0i}$ will enter
the Hamiltonian as arbitrary functions. As alluded
to above, $h_{ij}$ and their conjugate momenta $\pi^{ij}$ are the
dynamical degrees of freedom. We may choose
$(N^{\perp}, N^i) \equiv N^\mu$ and $h_{ij}$ as independent variables
rather than $(g_{00}, g_{0i}) \equiv g_{0\mu}$ and $h_{ij}$ for convenience
and go back to the other set of variables through [4]
and [5] if we desire. Let $\pi_\mu$ be the canonically conjugate
momenta to $N^\mu$; then it is obvious that a Lagrange
multiplier, $\chi^\mu$, is necessary, so that a $\chi^\mu\pi_\mu$ term has to be
added to the Hamiltonian due to the
arbitrariness of $N^\mu$. We remind the reader that in
electrodynamics an analogous situation arises while
analyzing its canonical structure: local gauge
symmetry plays a crucial role there. It is obvious
that the generic form of the Hamiltonian is (we shall
introduce $1/16\pi G$, etc., later)

$$H = \int d^3x\,\bigl(N^{\perp}\mathcal{H}_{\perp}[h_{ij}, \pi^{ij}] + N^i\mathcal{H}_i[h_{ij}, \pi^{ij}] + \chi^\mu\pi_\mu\bigr) \qquad [7]$$

From the perspective of constraint analysis, it is
natural that $\pi_\mu \approx 0$ appear as first-class constraints,
as they are multiplied by arbitrary functions. Moreover,
these constraints must hold good under the
deformation of the surface, which implies that $\{\pi_\mu, H\}_{PB}$
must vanish weakly, leading to $\mathcal{H}_\mu \approx 0$. As a
consistency requirement, these must be first-class
constraints if $N^\mu$ are to be arbitrary functions. We
identify $\pi_\mu \approx 0$ and $\mathcal{H}_\mu \approx 0$ as the primary and
secondary constraints, respectively. Thus far, we
have discussed the case for pure gravity; the
presence of matter fields in the full action modifies
the treatment appropriately.

Let us analyze the structure of the constraints for
the Einstein-Hilbert action [1]. For a compact
manifold with boundary $\partial M$, we have to add the
surface term which takes the form:

$$\frac{1}{8\pi G}\int_{\partial M} d^3x\,\sqrt{h}\,K$$

Here $K$ stands for the trace of the extrinsic curvature
of the boundary 3-surface and $h = \det h_{ij}$; note that
$h_{ij}$ is the induced metric on the 3-surface. If we
include matter fields, the corresponding action is to
be taken into account. Once we make the 3+1 split
of the metric, the action assumes the following form:

$$S = \frac{1}{16\pi G}\int d^3x\,dt\,N^{\perp}\sqrt{h}\,\bigl(K_{ij}K^{ij} - K^2 + {}^3R - 2\Lambda\bigr) \qquad [8]$$

where $K_{ij}$ is the extrinsic curvature of the 3-surface, expressed
through $\dot{h}_{ij}$ and the covariant derivatives $D_iN_j$.
Here $D_iN_j$ represents the covariant derivative of $N_j$ with
the connections computed from $h_{ij}$, and ${}^3R$ is the
curvature of the 3-surface. The canonical momenta
are

$$\pi^{ij} = -\frac{\sqrt{h}}{16\pi G}\,\bigl(K^{ij} - h^{ij}K\bigr) \qquad [10]$$

and we can invert this relation to get

$$K^{ij} = -\frac{16\pi G}{\sqrt{h}}\Bigl(\pi^{ij} - \frac{1}{2}h^{ij}\pi\Bigr), \qquad \pi = h_{ij}\pi^{ij}$$

The Hamiltonian form of the action is given by

$$S_H = \int d^3x\,dt\,\bigl(\dot{h}_{ij}\pi^{ij} - N^{\perp}\mathcal{H}_{\perp} - N^i\mathcal{H}_i\bigr) \qquad [11]$$
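The relation between the canonical momenta and the extrinsic curvature, $\pi^{ij} = -(\sqrt{h}/16\pi G)(K^{ij} - h^{ij}K)$, and its inverse, $K^{ij} = -(16\pi G/\sqrt{h})(\pi^{ij} - \tfrac{1}{2}h^{ij}\pi)$, can be checked with a round trip on sample data. This is a minimal sketch under assumptions: units with $G = 1$, a randomly generated 3-metric, and a random symmetric $K^{ij}$; the function names are illustrative.

```python
import numpy as np

G_NEWTON = 1.0  # units with G = 1, an assumption made only for this sketch

def momentum_from_curvature(K_up, h):
    """pi^{ij} = -(sqrt(h) / 16 pi G) (K^{ij} - h^{ij} K), with K = h_{ij} K^{ij}."""
    h_inv = np.linalg.inv(h)
    sqrt_h = np.sqrt(np.linalg.det(h))
    trace_K = np.einsum("ij,ij->", h, K_up)
    return -sqrt_h / (16.0 * np.pi * G_NEWTON) * (K_up - h_inv * trace_K)

def curvature_from_momentum(pi_up, h):
    """K^{ij} = -(16 pi G / sqrt(h)) (pi^{ij} - h^{ij} pi / 2), with pi = h_{ij} pi^{ij}."""
    h_inv = np.linalg.inv(h)
    sqrt_h = np.sqrt(np.linalg.det(h))
    trace_pi = np.einsum("ij,ij->", h, pi_up)
    return -(16.0 * np.pi * G_NEWTON / sqrt_h) * (pi_up - 0.5 * h_inv * trace_pi)

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3))
h = A @ A.T + 3.0 * np.eye(3)   # sample positive-definite 3-metric (assumption)
B = rng.normal(size=(3, 3))
K_up = 0.5 * (B + B.T)           # sample symmetric K^{ij} (assumption)

pi_up = momentum_from_curvature(K_up, h)
K_back = curvature_from_momentum(pi_up, h)
```

The factor $\tfrac{1}{2}$ in the inverse arises because taking the trace of the first relation in three dimensions gives $\pi = (\sqrt{h}/8\pi G)K$.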


Notice that [8] does not involve time derivatives of
$N^{\perp}$ and $N^i$; their corresponding canonical momenta
vanish,

$$\pi_\mu \approx 0 \qquad [12]$$

as expected from our earlier discussions about the
role of $N^\mu$. A straightforward constraint analysis
leads to the pair of constraints

$$\mathcal{H}_i = -2D_j\pi_i{}^{j} \approx 0 \qquad [13]$$

$$\mathcal{H}_{\perp} = 16\pi G\,G_{ijkl}\pi^{ij}\pi^{kl} - \frac{\sqrt{h}}{16\pi G}\bigl({}^3R - 2\Lambda\bigr) \approx 0 \qquad [14]$$

with $G_{ijkl}$ the De Witt metric defined in [25].
We mention in passing that the above constraint
equations get modified in the presence of matter
fields in the theory. This is relevant because the WDW
equation plays an important role in quantum
cosmology to describe the evolution of the universe
in early epochs, and the equation is studied in the
presence of a generic matter content, that is, a scalar
field with potential. The constraint equations [13]
and [14] modify to

$$\mathcal{H}_i^{T} = \mathcal{H}_i + \mathcal{H}_i^{\rm matter} \approx 0 \qquad [15]$$

$$\mathcal{H}_{\perp}^{T} = \mathcal{H}_{\perp} + \mathcal{H}_{\perp}^{\rm matter} \approx 0 \qquad [16]$$

The Algebra of Constraints 


In order to compute the classical Poisson bracket
algebra of the constraints [13] and [14], we use the
canonical Poisson bracket relations for the phase-space
variables on $\Sigma_t$:

$$\{h_{ij}(x), h_{kl}(x')\} = 0 \qquad [17]$$

$$\{\pi^{ij}(x), \pi^{kl}(x')\} = 0 \qquad [18]$$

$$\{h_{ij}(x), \pi^{kl}(x')\} = \delta^{k}_{(i}\delta^{l}_{j)}\,\delta(x, x') \qquad [19]$$


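Thus, Poisson brackets among the constraints [13]
and [14] are

$$\{\mathcal{H}_i(x), \mathcal{H}_j(x')\} = \mathcal{H}_i(x')\,\partial_j\delta(x, x') - \mathcal{H}_j(x)\,\partial'_i\delta(x, x') \qquad [20]$$

$$\{\mathcal{H}_i(x), \mathcal{H}_{\perp}(x')\} = \mathcal{H}_{\perp}(x)\,\partial_i\delta(x, x') \qquad [21]$$

$$\{\mathcal{H}_{\perp}(x), \mathcal{H}_{\perp}(x')\} = h^{ij}(x)\mathcal{H}_i(x)\,\partial_j\delta(x, x') - h^{ij}(x')\mathcal{H}_i(x')\,\partial'_j\delta(x, x') \qquad [22]$$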


When we resort to canonical quantization, the
starting point is the Hamiltonian action in the first-order
formalism, where the canonical variables are
subjected to the constraints [13] and [14] in terms of
$\mathcal{H}_{\perp}$ and $\mathcal{H}_i$ satisfying the algebra given by [20]-[22].
One encounters a number of important issues while
proceeding to canonically quantize the theory. We
shall mention only a few of them in what follows. It
is important to address issues related to the role of
the constraints in the quantized theory and how to
deal with the Lagrange multipliers $N^{\perp}$ and $N^i$. A
simple proposal is to solve the constraints at the
classical level, identify the physical degrees of
freedom, and quantize the theory subsequently.
There are four first-class constraints, $\mathcal{H}_{\perp}$ and $\mathcal{H}_i$;
therefore, out of the 12 phase-space variables,
$(h_{ij}, \pi^{ij})$, only eight are independent. We need to
supply four gauge conditions in order to render the
theory (classically) solvable. Thus, we are left with
four physical degrees of freedom in the Hamiltonian
phase space, that is, the two graviton polarizations and
their conjugate momenta, and we can quantize them. The
implementation of this idea is easier said than
done. One obstacle is that the constraints cannot
be solved in a closed form in this formalism. If we
fix a gauge and quantize the theory, we obviously
break the gauge invariance. It is essential to show,
subsequently, that all physically observable quantities
are independent of the gauge choice. Another
criticism of this formalism is that we already get rid
of some of the components of the metric. Therefore,
the spirit of the general theory of relativity, which is
based on the geometrical structure of spacetime, is
somewhat diluted. There are other suggestions
where $h_{ij}$ and their conjugate momenta are elevated
to quantum status before supplying the gauge
conditions. The issues of gauge fixing and dealing
with the constraints are addressed at the quantum
level. We replace the canonical Poisson bracket
algebra by the canonical commutators and proceed
further. The momentum operator assumes the form

$$\hat{\pi}^{ij} = -i\,\frac{\delta}{\delta h_{ij}}$$

and the wave functional depends on $h_{ij}$, that is, $\Psi[h_{ij}]$.
There are many technical problems related to the 
properties of the states and we shall not deal with 
them due to limitations of space. It is essential to 
discuss the role of the constraints in the quantum 
theory. We demand that the quantum constraints 
annihilate the physical states (recall the Gauss law 
constraint in gauge theories). However, the issue of 
operator ordering is to be dealt with which in turn is 
connected with the Hermiticity properties of the 
quantum constraints. The Hamiltonian constraint
$\mathcal{H}_{\perp} \approx 0$ (henceforth denoted as $H$ and referred to as the
Hamiltonian) involves products of the metric $h_{ij}$ and $\pi^{ij}$, so
there is a certain ambiguity in defining the constraint
operator, and one has to choose a convention.
The condition that the Hamiltonian, $H$, consisting
of gravitational and matter components, annihilates
the state is expressed as

$$\hat{H}\Psi = 0 \qquad [23]$$


When we adopt the coordinate representation for $\pi^{ij}$,
the above equation takes the form

$$\left[-16\pi G\,G_{ijkl}\frac{\delta}{\delta h_{ij}}\frac{\delta}{\delta h_{kl}} - \frac{\sqrt{h}}{16\pi G}\bigl({}^3R - 2\Lambda\bigr) + \mathcal{H}^{\rm matter}\right]\Psi[h_{ij}, \phi] = 0 \qquad [24]$$

This is the celebrated WDW equation. Here we have
considered a simple case where the matter Hamiltonian
density generically contains a single scalar field, $\phi$,
and therefore $\Psi$ is a functional of the 3-metric on $\Sigma_t$ and
$\phi$. $G_{ijkl}$ is the De Witt metric in the superspace:

$$G_{ijkl} = \frac{1}{2\sqrt{h}}\bigl(h_{ik}h_{jl} + h_{il}h_{jk} - h_{ij}h_{kl}\bigr) \qquad [25]$$
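The indefinite character of the De Witt metric $G_{ijkl} = (h_{ik}h_{jl} + h_{il}h_{jk} - h_{ij}h_{kl})/(2\sqrt{h})$, which underlies the hyperbolic nature of the WDW equation, can be seen at a single point by treating it as a quadratic form on the six-dimensional space of symmetric $3 \times 3$ matrices. The sketch below is an illustration under assumptions: a randomly generated positive-definite 3-metric and a hand-built basis of symmetric matrices; the function names are hypothetical.

```python
import numpy as np

def dewitt_metric(h):
    """G_{ijkl} = (h_ik h_jl + h_il h_jk - h_ij h_kl) / (2 sqrt(h))."""
    sqrt_h = np.sqrt(np.linalg.det(h))
    return (np.einsum("ik,jl->ijkl", h, h)
            + np.einsum("il,jk->ijkl", h, h)
            - np.einsum("ij,kl->ijkl", h, h)) / (2.0 * sqrt_h)

def symmetric_basis():
    """Six basis matrices spanning the symmetric 3x3 matrices."""
    basis = []
    for i in range(3):
        for j in range(i, 3):
            e = np.zeros((3, 3))
            if i == j:
                e[i, i] = 1.0
            else:
                e[i, j] = e[j, i] = 1.0 / np.sqrt(2.0)
            basis.append(e)
    return basis

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3))
h = A @ A.T + 3.0 * np.eye(3)   # sample positive-definite 3-metric (assumption)

G = dewitt_metric(h)
basis = symmetric_basis()
# 6x6 matrix of the quadratic form G_{ijkl} e_A^{ij} e_B^{kl}.
gram = np.array([[np.einsum("ijkl,ij,kl->", G, ea, eb) for eb in basis]
                 for ea in basis])
eigenvalues = np.linalg.eigvalsh(gram)
n_negative = int(np.sum(eigenvalues < 0))
n_positive = int(np.sum(eigenvalues > 0))
```

One negative and five positive directions are found; the negative direction corresponds to the trace part of the metric perturbation, that is, to overall changes of scale.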
Remarks The space of all 3-metrics and the scalar
field $(h_{ij}, \phi)$ on $\Sigma_t$, used for the description of classical
evolutions, is called the superspace (no connection
with the superspace of supersymmetry). Thus,
$\Psi[h_{ij}, \phi]$ is a functional on superspace. Furthermore,
$\Psi$ carries no explicit dependence on $t$. This is a
consequence of the fact that “time” plays the role of
a parameter in the general theory of relativity; thus
the dynamical variables $h_{ij}$ and $\phi$ already describe the
evolutionary processes although $t$ does not make its
appearance. As mentioned earlier, we always discuss
the case when $\Sigma_t$ is compact. Another point to note
is that the quantum momentum constraint, $\mathcal{H}_i$, as an
operator annihilates the wave function, which is a
statement of the quantum-mechanical invariance of
the theory under three-dimensional diffeomorphisms.
The WDW equation, in turn, conveys invariance
of the theory under reparametrization,
although careful analysis is necessary to prove this
point. Now we proceed to discuss the solutions of
the WDW equation.


WDW Equation and the Solutions 


It is recognized that the WDW equation [24] is a
second-order hyperbolic functional differential equation
and naturally it has an enormous number of
solutions. Therefore, if we want the WDW equation
to have any predictive power, it is necessary to
introduce boundary conditions. One of the possible
choices is to specify the wave function on the
boundary of the superspace. Indeed, the central
issue of quantum cosmology is the choice of
boundary conditions, which has been an
important topic of debate. This point will be briefly
discussed later. Notice that the boundary condition
has to be introduced keeping in mind how the
universe is expected to behave as it evolves. There is
a proposition that the boundary condition for the
quantum evolution of the universe be given the
status of a physical law. Therefore, the role of the
wave functional, $\Psi[h_{ij}(x), \phi(x), B]$, its evolution, and
interpretation are central to the development of
quantum cosmology. Thus, $\Psi$ represents the amplitude
for the universe to have $h_{ij}(x)$ on the 3-surface,
$B$, and matter field $\phi(x)$. It is argued that the path-integral
formalism should be adopted as an alternative
to the canonical prescription to solve for the
wave function, or rather the transition amplitude,
satisfying the WDW equation. Here the first step is
to define the Euclidean version of the gravitational
action keeping in mind the subtleties. As is well
known, we deal with the propagator (or transition
amplitude) in the path-integral approach where the
functional integral is carried out over a set of
4-metrics and matter fields, with the Euclidean action
inside the integral acting as the weight factor. We
recall that while formulating quantum mechanics in
the path-integral approach, we sum over all possible
paths in the functional integral. However, in the
semiclassical approximation, the amplitude is dominated
by the action corresponding to the classical
path and we approximate the wave function as
$\Psi \sim e^{(i/\hbar)S_{cl}}$, and it gets modified appropriately in the
Euclidean formulation. Against this background, we
briefly discuss how the wave function of the
universe is obtained in the path-integral formalism.

According to the proposal of Hartle and Hawking,
one adopts the path-integral formalism for the Euclidean
action where the functional integral is not only
carried out over the 4-metric, $g_{\mu\nu}$, and the scalar
field $\phi$, but one also sums over the class of
manifolds, $M$. Note that $B$ is a part of the boundary
of this set of manifolds. If $\bar{h}_{ij}$ and $\bar{\phi}$ are the induced
metric and the configuration of the scalar field, $\phi$,
on the boundary, $B$, then the propagator (henceforth
we just call it the wave function) $\Psi[\bar{h}_{ij}, \bar{\phi}, B]$ can be
given a functional-integral representation. Indeed,
obtaining the most general form of the path integral,
summing over the 4-manifolds, is quite a formidable
task. On the other hand, if one chooses a class of
4-manifolds which can be decomposed as a product
(foliation) $\mathbb{R} \times B$, the wave function is expressed as


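$$\Psi[\bar{h}_{ij}, \bar{\phi}, B] = \int \mathcal{D}N^\mu \int \mathcal{D}h_{ij}\,\mathcal{D}\phi\;\delta[f(N^\mu)]\,\Delta_{FP}\,\exp\bigl(-S_E[g_{\mu\nu}, \phi]\bigr) \qquad [26]$$

We have introduced the gauge-fixing condition
as $f(N^\mu)$, which is usually taken to be $\dot{N}^\mu = \chi^\mu$, and
then the corresponding Faddeev-Popov determinant,
$\Delta_{FP}$, has to be inserted into the path-integral
measure. We recall from our earlier discussions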
that $N^\mu$ has to be unrestricted on the boundary, $B$,
since they have no dynamical role when we express
the action in terms of the variables defined on the
3-surface. As noted in the previous discussion,
explicit time dependence does not appear after the
3+1 split and $(h_{ij}(x), \phi(x))$ have no dependence on
$t$. Therefore, we introduce a parameter to designate
the paths over which the functional integral is to be
taken. Recall that in the quantum-mechanical case,
the paths are parametrized as $q_i(t)$ for the coordinates.
However, when we resort to a parametrization
of the variables for the case at hand, certain
conditions must be fulfilled. We are permitted to
integrate $h_{ij}$ and $\phi$ over only those paths,
parametrized as $(h_{ij}(x, \tau), \phi(x, \tau))$, that
match the arguments of the wave function on the
boundary $B$. Therefore, we may define the metric
and the scalar field configuration so that at $\tau = 1$
they assume their functional values on the boundary:
in other words, $\bar{h}_{ij}(x) = h_{ij}(x, \tau = 1)$ and
$\bar{\phi}(x) = \phi(x, \tau = 1)$. It is worthwhile to go back to
the quantum-mechanical analogy once more. When
we compute amplitudes/propagators in quantum
mechanics, the functional integral is defined for the
amplitude of going from a configuration $q_i$ to $q_f$
while summing over all possible paths originating
from one endpoint $q_i$ and ending at the final point
$q_f$. On this occasion, we have imposed the constraint
on the final endpoint belonging to the
boundary $B$. Thus, in order to determine the wave
function of the universe, we are required to specify
the initial configurations of $h_{ij}$ and $\phi$ at $\tau = 0$. We
shall not enter into important issues related to the
properties of the Euclidean action, the problems
associated with the choice of contours of the path
integrals, and related topics. The reader will find
detailed discussions in the lectures and monographs
referred to in the "Further reading" section.

It is important to re-emphasize that boundary 
conditions are to be introduced while solving the 
WDW equation. It was argued by De Witt that the
wave function would be determined uniquely from the
mathematical consistency of the theory, but that
hope has not been realized. Whether one attempts to
solve the functional differential WDW equation or 
obtain the wave function in the path-integral 
formalism, the issue of boundary condition is 
unavoidable. There are mainly three different kinds 
of boundary conditions in quantum cosmology: 
Hartle-Hawking (HH)  no-boundary proposal, 
Vilenkin's tunneling mechanism, and Linde's bound- 
ary condition. We shall briefly discuss the first two 
proposals. Instead of stating the boundary condi- 
tions in full generality, we shall envisage quantum 
cosmology in a minisuperspace and provide illus- 
trative examples to compare the main features of 
HH and Vilenkin solutions to the WDW equation. 

It is realized that the discussion and solution of
quantum cosmology in the superspace are rather
difficult, since we deal with functional differential
equations and the configuration space is infinite
dimensional. Therefore, it is worthwhile to consider
a system, as a simple model, which has a finite number of degrees
of freedom. Thus, we assume that the metric and
matter fields depend only on cosmic time to begin
with. There is a physical motivation behind this
assumption, since the present classical state of the
universe is described by the Friedmann-Robertson-Walker
(FRW) metric corresponding to an isotropic
and homogeneous universe. Notice that the classical
evolution equation resembles that of the motion of a
particle. The quantum evolution equations are now
given by differential equations of quantum
mechanics rather than functional differential equations.
Similarly, the path-integral formulation
becomes analogous to the quantum-mechanical
framework. Of course, adopting such a simplified
approach prevents us from describing some of the
important aspects of quantum gravity. However,
within this framework, several essential features can
be exhibited and deep insight might be gained into
the physics of the very early universe. The first step
in getting the minisuperspace metric is to assume
that the lapse is homogeneous, that is, $N^{\perp} = N^{\perp}(t)$,
and the shift is set to zero, $N^i = 0$. Thus, the metric
takes the form

$$ds^2 = -(N^{\perp}(t))^2 dt^2 + h_{ij}(x, t)\,dx^i dx^j \qquad [27]$$

The relevant choice of 3-metric for the FRW isotropic
and homogeneous universe is

$$h_{ij}(x, t)\,dx^i dx^j = a^2(t)\,d\Omega_3^2 \qquad [28]$$

Note that $d\Omega_3^2$ is the metric on a 3-sphere. It is
straightforward to derive the Friedmann equations
for such a geometry.

The HH no-boundary condition can be interpreted
as a topological proposition about the set of
paths over which we have to sum. The 3-surface $B$ is
to be taken as the only boundary of a compact
4-manifold $M$ which is endowed with the metric
$g_{\mu\nu}$, and $\bar{h}_{ij}$ and $\bar{\phi}$ are the induced metric and the
scalar field on the surface. The wave function is
obtained by using the matching condition supplemented
with an initial condition. For the minisuperspace
case, initial conditions impose constraints on
the scale factor $a(\tau = 0)$ and $(da/d\tau)(\tau = 0)$, and $N^{\perp}$
is to be gauge fixed. These conditions are to be
implemented in the context of determining the wave
function of the universe. In the case of the tunneling
boundary condition of Vilenkin, the qualitative 
scenario is as follows. If we look at the solution 
to the WDW equation (in the path-integral 
approach, Vilenkin considers Lorentzian action), 
the solution, crudely speaking, has both ingoing 
and outgoing modes at the boundary. In his 
proposal, the outgoing mode at the boundary is to 
be accepted. The exact prescription is a lot more
subtle than the above statement, since one has to
define the meaning of outgoing mode carefully in 
the absence of a timelike Killing vector when we 
write the WDW equation on the superspace. The 
qualitative picture for Vilenkin's boundary condi- 
tion, in the minisuperspace, is like tunneling solu- 
tions in quantum mechanics when a particle 
penetrates through a potential barrier. 

Let us consider a minisuperspace model with the
scalar field and potential $V(\phi)$. The action is

$$S = \frac{1}{2}\int dt\,N^{\perp}a^3\left[-\frac{1}{(N^{\perp})^2}\Bigl(\frac{\dot{a}}{a}\Bigr)^2 + \frac{\dot{\phi}^2}{(N^{\perp})^2} - V(\phi) + \frac{1}{a^2}\right] \qquad [29]$$

A few comments are in order here. For the FRW
metric, we have $\sqrt{-g}\,R = 6(-a\dot{a}^2 + ka)$ + a total
derivative term; the total derivative term can be
removed by adding a boundary term, and $k$ is
positive since we take the spatial part to be closed.
We have redefined the scale factor, the scalar
field, the potential term, and $k$ such that the
Einstein-Hilbert action with matter field assumes
the form of [29]; this action facilitates the
definition of conjugate momenta without cumbersome
numerical factors, and the Hamiltonian takes
a simple form. The conjugate momenta and resulting
Hamiltonian are

$$p_a = -\frac{a\dot{a}}{N^{\perp}}, \qquad p_\phi = \frac{a^3\dot{\phi}}{N^{\perp}}$$

$$H_c = \frac{N^{\perp}}{2}\left[-\frac{p_a^2}{a} + \frac{p_\phi^2}{a^3} + a^3V(\phi) - a\right] \equiv N^{\perp}H \qquad [31]$$

and the constraint is $H = 0$. In the quantum
cosmology context, we solve the WDW equation:
$H\Psi = 0$. Since the exact solution is not possible, one
resorts to some approximation with simple assumptions.
The differential equation is

$$\left[\frac{\partial^2}{\partial a^2} - \frac{1}{a^2}\frac{\partial^2}{\partial\phi^2} + a^4V(\phi) - a^2\right]\Psi = 0 \qquad [32]$$

Let us consider the case when V(¢) does not grow 
very fast, that is, V(o)/V(ó)' << 1 and consider the 
solution to the WDW equation where V has weak 
dependence on ¢. Consequently, we may ignore the 
@ derivative term in [32]. The purpose of these 
assumptions is to reduce the problem to a one- 
dimensional quantum mechanics problem and then 
employ WKB method. It is hoped that at least some 
of the nonperturbative aspects can still be captured. 
When the effective potential appearing in [32] is 
negative (this is a classically inaccessible region), the 
wave function is 


V (a, à) = e*/3v(9) 0-2 v(o)^ [33] 


We expect the wave function have oscillatory 
behavior in the classically allowed domain and it 
does have that property, 


V (a, p) x EVNE- [34] 


The choice of signs is decided by the boundary
conditions imposed, and the usual matching of
the wave functions of the two regions is done as in
the WKB approximation. Note that we
are considering the metric and the scalar field on
the boundary, which were denoted by $\bar{h}_{ij}$ and $\bar{\phi}$;
strictly speaking, we should denote the solutions
as $\bar{a}$ and $\bar{\phi}$. But from now on, we drop this bar on
a and $\phi$.
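
As a quick consistency check on the WKB exponents, the following minimal sketch assumes that, once the $\phi$-derivative is dropped, [32] reduces to the one-dimensional equation $\Psi'' = U(a)\Psi$ with $U(a) = a^2(1 - a^2V)$ for a constant V. This explicit form of U, the sample values, and all function names are assumptions made for illustration. Under the barrier a WKB exponent S must then satisfy $(dS/da)^2 = U(a)$, which the code tests numerically for $S = (1 - a^2V)^{3/2}/(3V)$.

```python
import math


def effective_potential(a, v):
    """Assumed U(a) = a^2 (1 - a^2 V); positive under the barrier (a^2 V < 1)."""
    return a * a * (1.0 - a * a * v)


def barrier_exponent(a, v):
    """Candidate WKB exponent S(a) = (1 - a^2 V)^(3/2) / (3 V), for a^2 V < 1."""
    return (1.0 - a * a * v) ** 1.5 / (3.0 * v)


def derivative(f, a, h=1e-5):
    """Central finite-difference approximation of df/da."""
    return (f(a + h) - f(a - h)) / (2.0 * h)


def wkb_residual(a, v):
    """(dS/da)^2 - U(a); close to zero if S is a valid leading-order exponent."""
    ds = derivative(lambda x: barrier_exponent(x, v), a)
    return ds * ds - effective_potential(a, v)


if __name__ == "__main__":
    for a in (0.2, 0.7, 1.1):  # all satisfy a^2 V < 1 for V = 0.5
        print(a, wkb_residual(a, 0.5))
```

The residual vanishes up to finite-difference error for either sign of S, which is why the boundary conditions, not the WKB equation itself, must select the sign.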

Let us momentarily assume that V is $\phi$-independent
and, therefore, that we have an effective cosmological
constant. The problem is identical to the motion
of a particle in a potential well. There are two
turning points. In one region, the particle starts from
a = 0, reaches one turning point $r_1$ and returns back.
In another case, it starts from a = ∞, travels up to
a = $r_2$ and reflects back. In the quantum-mechanical
case, the particle can tunnel through the barrier. The
wave function has both decaying and growing
modes under the barrier, and boundary conditions
tell us which mode to choose. One possibility is that
the particle starts from a = 0, tunnels through and
proceeds towards a = ∞, that is, it has only an outgoing
mode. The other possibility is that the wave function
has both outgoing and ingoing modes. In this simple
scenario, the former corresponds to Vilenkin's
tunneling boundary condition, where the universe
is created at a = 0 and it keeps growing. The latter is
the HH no-boundary proposal, where the wave function
has both modes and the universe contracts and
expands.

Now we discuss the two boundary conditions in
the presence of the potential, with the approximations
mentioned above. The proposition of Vilenkin
amounts to the following conditions on the wave
function: the region of the boundary which is
nonsingular is $\phi$ finite and a = 0. Other than this
domain, either a or $\phi$ diverges on any other region of
the boundary; both can diverge in this singular
boundary. Notice from the expressions [33] and
[34] that the tunneling region corresponds to
$a^2V(\phi) < 1$, whereas the oscillatory domain is
$a^2V(\phi) > 1$. If we use the saddle-point approximation,
$\Psi \approx e^{iS}$, Vilenkin's boundary condition corresponds
to $\Psi = e^{iS_-}$, with


$S_\pm = \pm\frac{\left(a^2V(\phi) - 1\right)^{3/2}}{3V(\phi)}$


So far, we considered the situation where the differential
operator for $\phi$ is dropped in [32]. In order to
account for weak $\phi$-dependence, we could introduce
it by multiplying by a slowly varying function, say $F(\phi)$,
and write $\Psi(a,\phi) \approx F(\phi)e^{iS_-}$. Similarly, the wave
function can be obtained under the barrier and
required to satisfy WKB matching conditions.
Furthermore, the regularity condition on the wave
function in the small scale factor limit and the behavior of
its derivative with respect to $\phi$ in that limit
determine the form of $F(\phi)$. In summary, the
Vilenkin boundary conditions yield the following
wave functions:


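$\Psi_V(a,\phi) \approx \exp\left\{-\frac{1}{3V(\phi)}\left[1 - \left(1 - a^2V(\phi)\right)^{3/2}\right]\right\}$ [35]


$\Psi_V(a,\phi) \approx e^{-1/(3V(\phi))}\exp\left[-\frac{i}{3V(\phi)}\left(a^2V(\phi) - 1\right)^{3/2}\right]$ [36]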


Note that [35] is the wave function under the barrier,
that is, $a^2V(\phi) < 1$ in this region, whereas [36] is in
the classically accessible domain ($a^2V(\phi) > 1$), which
is reflected by the oscillatory character. The slowly
varying function $F(\phi) \approx e^{-1/(3V(\phi))}$ appears as the
common factor for the wave functions in the two
domains.

The HH no-boundary proposal to derive the wave
function of the universe was formulated in the
Euclidean path-integral formalism. A considerable
amount of attention has been focused on this area.
We shall present the HH wave function, providing
only a sketchy argument. In the Euclidean description,
the 4-metric is $ds^2 = N^2(\tau)\,d\tau^2 + a^2(\tau)\,d\Omega_3^2$. The
4-geometry should close in a regular way. If we
make the bounding 3-space smaller and smaller, it
can be closed with flat space. We can infer the
behavior of the scale factor in the limit $\tau \to 0$ from
this consideration. Furthermore, in the semiclassical
approximation $\Psi(a,\phi) \approx e^{-S_E}$; we have replaced
$(\bar{a}, \bar{\phi})$ by $(a, \phi)$ as remarked earlier. Thus, our aim
is to evaluate $S_E$ at the saddle point. This is achieved
by writing down the (Euclidean version of the) field
equations for a and $\phi$ and the Hamiltonian
constraint, and then solving for $a(\tau)$, $\phi(\tau)$, and $N(\tau)$.
Eventually, we want to eliminate N and then
obtain $S_E$. After all, the path integral is dominated
by the classical trajectory, $a(\tau)$, and one does not fix
the gauge for N while solving for a. In fact, the
lapse gets eliminated by utilizing the Hamiltonian
constraint, which involves $\tau$-derivatives of both a and
$\phi$. We mention, without going into details, that the
classical action is not unique. One of the ways to
visualize this is to note that the solutions obtained for
the lapse from the Hamiltonian constraint have sign
ambiguities.

The classical action is 


$S_E = -\frac{1}{3V(\phi)}\left[1 \pm \left(1 - a^2V(\phi)\right)^{3/2}\right]$ [37]


Note that the two solutions correspond to the 3-sphere
boundary being closed off by sections of the 4-sphere.
Moreover, the Euclidean action is negative. Hartle
and Hawking argue that the negative sign in [37]
gives the correct answer since the wave function
peaks for that choice. However, there is no unanimity
on the HH argument, and some authors have put
forward the point of view that additional inputs are
necessary to arrive at the HH conclusion about
choosing the negative sign for $S_E$ in [37]. We refer the
reader to the reviews of Hartle and Halliwell for
detailed discussions on the choice of contours for
path integrals, and on the subtleties involved in getting various
solutions for the lapse and their interpretations. We
give below the wave function under the barrier (with
the choice of negative sign in [37]) and in the oscillatory region:


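$\Psi_{HH}(a,\phi) \approx \exp\left\{\frac{1}{3V(\phi)}\left[1 - \left(1 - a^2V(\phi)\right)^{3/2}\right]\right\}$ [38]


$\Psi_{HH}(a,\phi) \approx e^{1/(3V(\phi))}\cos\left[\frac{1}{3V(\phi)}\left(a^2V(\phi) - 1\right)^{3/2} - \frac{\pi}{4}\right]$ [39]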


Remarks The wave function in [38] is obtained in
the classically inaccessible region under the condition
$a^2V(\phi) < 1$, and wave function [39] corresponds to
the case $a^2V(\phi) > 1$, where the particle motion is
permissible classically. Note the factor $e^{1/(3V(\phi))}$ in the
wave functions in both regions and compare that
with Vilenkin's wave function, which has the
opposite sign in the exponent. We may conclude where the wave
function will peak for each of the two boundary
conditions. Whereas Vilenkin's proposal implies that
$\Psi_V(a,\phi)$ peaks when $V(\phi)$ takes large values, the HH
no-boundary condition tells us that it peaks when
$V(\phi) \to 0$. Furthermore, we note that $\Psi_V$ is complex
and $\Psi_{HH}$ is real in the oscillatory region. Although
the debates on the merits and demerits of each of the
boundary proposals have been going on for more than two
decades, the issue is far from being settled. In the
absence of any experimental tests, there is no way to
favor one boundary proposal over another. Nevertheless, the
boundary conditions do lead to predictions about the
evolution of the universe after the quantum era, that is,
in the classical regime. Therefore,
determination of the wave function with specific
boundary conditions does have some connections
with the laws that govern the evolution of our
universe in the present epoch.
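
The contrast drawn in the Remarks can be made concrete with the common factors alone. The sketch below assumes the prefactors are $e^{-1/(3V)}$ for the tunneling wave function and $e^{+1/(3V)}$ for the no-boundary one, and evaluates them at a few illustrative constant values of V. The function names and sample values are hypothetical, and the a-dependent parts of the wave functions are ignored.

```python
import math


def tunneling_prefactor(v):
    """Assumed common factor exp(-1/(3V)) of the tunneling wave function."""
    return math.exp(-1.0 / (3.0 * v))


def no_boundary_prefactor(v):
    """Assumed common factor exp(+1/(3V)) of the no-boundary wave function."""
    return math.exp(1.0 / (3.0 * v))


if __name__ == "__main__":
    for v in (0.1, 0.5, 2.0):
        # The squared modulus of each prefactor is a rough relative weight.
        print(v, tunneling_prefactor(v) ** 2, no_boundary_prefactor(v) ** 2)
```

The first weight grows with V and the second grows as V decreases, which is the qualitative difference between the two proposals described above.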


It is worthwhile to dwell on the WDW equation 
from the perspectives of string theories. Indeed, there 
have been important developments to understand the 
dynamics of the universe in the string-theoretic 
framework. It is important to note the key role
played by the dilaton in string theory: (1) it is one of the
massless states of the theory, and (2) the vacuum
expectation value (VEV) of this field determines the
coupling constants we hope to use in describing
fundamental interactions. Therefore, the graviton is
always accompanied by the dilaton in any string-theoretic
approach to the study of the universe. The duality
symmetries are recognized to provide a deep understanding
of string dynamics. Therefore,
investigations of quantum gravity phenomena from
the string-theory viewpoint are necessarily influenced
by the above-mentioned facts. Indeed, classical cosmological
solutions, derived from the string effective action,
have several interesting characteristics. We mention in
passing that the WDW equation has played an
important role in the study of quantum evolution equations
in string cosmology. The choice of operator-ordering
prescription in defining the WDW Laplace-Beltrami
operator can be resolved by appealing to the duality
symmetries. Furthermore, the boundary conditions
imposed on the wave function are dictated by string
symmetries and, therefore, the resulting wave function
has very interesting properties. String theory has
addressed some of the most important problems in
quantum gravity and it has provided resolutions to
several key issues. It is expected that string theory
will provide answers to challenging questions in
quantum cosmology. In summary, we have conveyed
some of the salient aspects of the WDW equation. 
The canonical quantization technique is adopted to 
study quantum gravity in this approach. We have 
illustrated the crucial role of the constraint formalism 
due to Dirac and argued that some of the nonpertur- 
bative aspects of quantum gravity could be retained. 
In a short article of this nature, it is not possible to 
provide detailed discussion about the general deriva- 
tion of the WDW equation and discuss the role of 
boundary conditions more exhaustively. Instead, we 
presented some of the key steps in the derivation of 
the WDW equation adopting the canonical formalism 
and provided simple examples. The subject is still an 
active area of research. The interested reader may 
benefit from the bibliography. 


See also: Canonical General Relativity; Loop Quantum 
Gravity; Quantum Cosmology; Quantum Dynamics in 
Loop Quantum Gravity; Quantum Geometry and its 
Applications; Superstring Theories. 
Introduction


Historically, the first question where the Wulff shapes 
have appeared is the one of the formation of a droplet 
or a crystal of one substance inside another. The 
natural problem here is: what shape such a formation 
would take? The statement that such a shape should 
be defined by the minimum of the overall surface 
energy subject to the volume constraint is physically 
very natural. In the isotropic case, when the surface 
tension does not depend on the orientation of the 
surface, and so is just a positive number, the shape in 
question should be of course spherical (provided we 
neglect the gravitational effects). In a more general 
situation the shape in question is less symmetric. The 
corresponding variational problem is called the Wulff 
problem. Wulff (1901) formulated it in his paper, 
where he also presented a geometric solution to it, 
called the “Wulff construction." 

The Wulff variational problem is formulated as
follows. Let $\tau(n)$, $n \in S^{d-1}$, be some continuous
function on the unit sphere $S^{d-1} \subset \mathbb{R}^d$. We suppose
that $\tau > 0$, and that $\tau$ is even: $\tau(n) = \tau(-n)$. The value
$\tau(n)$ plays the role of the surface tension between two
phases separated by the hyperplane orthogonal to the
vector n. For every closed compact (hyper)surface
$M^{d-1} \subset \mathbb{R}^d$, we define its surface energy as


$W_\tau(M) = \int_M \tau(n_s)\,ds$


where $n_s$ is the normal vector to M at $s \in M$. The
functional $W_\tau(M)$ has the meaning of the surface
energy of the M-shaped droplet made from one of
these two phases. It is called the Wulff functional.
Let $\mathcal{W}_\tau$ be the surface which minimizes $W_\tau(\cdot)$ over
all the surfaces enclosing the unit volume. Such a
minimizer does exist and is unique up to translation.
It is called the Wulff shape.

The following is the geometric construction of
$\mathcal{W}_\tau$. Consider the set


$K_\tau = \{x \in \mathbb{R}^d : \forall n \in S^{d-1},\ (x,n) \le \tau(n)\}$

If we define the half-spaces

$L_{\tau,n} = \{x \in \mathbb{R}^d : (x,n) \le \tau(n)\}$


then

$K_\tau = \bigcap_n L_{\tau,n}$ [1]

In particular, $K_\tau$ is convex. It turns out that

$\mathcal{W}_\tau = \lambda_\tau\,\partial(K_\tau)$


where the dilatation factor $\lambda_\tau$ is defined by the
normalization: $\mathrm{vol}(\lambda_\tau K_\tau) = 1$. The relation [1] is
called the Wulff construction. For future use,
we introduce the notation $w_\tau$ for the value of the
surface energy of the Wulff shape:


$w_\tau = W_\tau(\mathcal{W}_\tau)$
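
The intersection-of-half-spaces construction can be sketched numerically. The following is a minimal two-dimensional illustration, assuming the unnormalized body is the set of points x with $(x,n) \le \tau(n)$ for every unit vector n, and approximating "every n" by a finite sample of directions. The function names, the tolerance, and the two example surface tensions are assumptions made for illustration, and the rescaling to unit volume is not performed.

```python
import math


def in_wulff_body(x, tau, n_dirs=720, tol=1e-12):
    """Test whether the 2D point x lies in every sampled half-plane (x, n) <= tau(n)."""
    for k in range(n_dirs):
        theta = 2.0 * math.pi * k / n_dirs
        n = (math.cos(theta), math.sin(theta))
        if x[0] * n[0] + x[1] * n[1] > tau(n) + tol:
            return False
    return True


def isotropic(n):
    """Orientation-independent surface tension; the body is the unit disk."""
    return 1.0


def l1_tension(n):
    """Example anisotropic tension |n_1| + |n_2|; the body is the square [-1, 1]^2."""
    return abs(n[0]) + abs(n[1])


if __name__ == "__main__":
    for point in [(0.6, 0.6), (0.8, 0.8), (0.9, 0.9), (1.1, 0.0)]:
        print(point, in_wulff_body(point, isotropic), in_wulff_body(point, l1_tension))
```

With the isotropic tension the test reproduces a disk, in line with the spherical droplet of the isotropic case, while the second tension gives a square with flat facets.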


The Wulff construction was considered by the 
rigorous statistical mechanics as just a phenomeno- 
logical statement, though the notion of the surface 
tension was among its central notions. The situation 
changed after the appearance of the book by 
Dobrushin et al. (1992). There it was shown that 
in the setting of the canonical ensemble formalism, 
in the regime of the first-order phase transition, the 
(random) shape occupied by one of the phases has 
asymptotically (in the thermodynamic limit) a 
nonrandom shape, given precisely by the Wulff 
construction! In other words, a typical macroscopic 
random droplet looks very close to the Wulff shape. 

In what follows we will explain the above result. 
Another important application of the concepts 
introduced above — the role played by the Wulff 


shapes in the theory of metastability — is also 
described (see Metastable States). 


Crystals in the Ising Model 


Ising spins $\sigma_x$ take values ±1, with $x \in \mathbb{Z}^d$. We will
wrap $\mathbb{Z}^d$ into a torus $T_N^d$ by taking a factor lattice:
$T_N^d = \mathbb{Z}^d/N\mathbb{Z}^d$. The Ising-model grand canonical Gibbs
state in $T_N^d$ is the probability measure


$\mu_N^\beta(\sigma) = Z_{N,\beta}^{-1}\exp(-\beta H_N(\sigma))$


Here $H_N(\sigma) = -\sum_{x,y\ \mathrm{n.n.},\ x,y \in T_N^d}\sigma_x\sigma_y$, $\beta > 0$ is the
inverse temperature, and $Z_{N,\beta}$ is the normalization
factor. The Ising-model canonical Gibbs state in $T_N^d$
is the probability measure $\mu_N^{\beta,\rho}$, obtained from $\mu_N^\beta$ by
taking its conditional distribution:


$\mu_N^{\beta,\rho}(\sigma) = \mu_N^\beta\Big(\sigma \,\Big|\, \sum_{x \in T_N^d}\sigma_x = \rho N^d\Big), \qquad |\rho| < 1$


(Here we make a slight abuse of notation. More
precisely, since $\sigma_x = \pm 1$, one has to consider
the conditioning $\sum_x \sigma_x = \rho_N N^d$, where $\rho_N \to \rho$ as
$N \to \infty$, while the numbers $(1 - \rho_N)N^d$ are even
integers; otherwise the condition is empty.) We will
characterize the canonical state $\mu_N^{\beta,\rho}$ by describing the
properties of contours, $\{\gamma_i(\sigma)\}$, of configuration $\sigma$.
Contours $\gamma_i$ of configuration $\sigma$ are hypersurfaces
made of elementary (d − 1)-dimensional unit cubes of
the dual lattice, which separate the nearest-neighbor
(n.n.) points $x, y \in T_N^d$ where $\sigma_x \ne \sigma_y$.

Suppose that the temperature $\beta^{-1}$ is low enough,
while the density parameter $\rho$ satisfies the constraints:

$m^*(\beta) > \rho > g_d$

Here $m^*(\beta)$ is the spontaneous magnetization of the
d-dimensional Ising model, while $g_d$ is some geometric
factor, the role of which will be explained
later. The above constraint forces some amount of
the (−)-phase into the (+)-phase. It turns out that
this amount gathers into one big droplet, which has
approximately the Wulff shape.

We first formulate the known rigorous results for
the case d = 2, and then indicate some extensions.


Two-Dimensional Case 


The following holds with $\mu_N^{\beta,\rho}$-probability approaching
1 as $N \to \infty$:


• The set $\{\gamma_i(\sigma)\}$ of contours of $\sigma$ has precisely one
“big” contour, $\Gamma(\sigma)$; the diameters of other
contours do not exceed $K \ln N$, $K = K(\beta)$.

• The area $|\mathrm{Int}\,\Gamma(\sigma)|$ inside $\Gamma(\sigma)$ satisfies


| IntT (c)| — vN?| < KN” (In N)“ 
where


y 30) — lel 


2m (8) x = x(p) 


• There is a point $x = x(\sigma)$, the “center” of $\Gamma(\sigma)$,
such that the shift of $\Gamma(\sigma)$ by $-x(\sigma)$ brings the
contour $\Gamma(\sigma)$ very close to the scaled Wulff curve,
defined by the Ising-model surface tension $\tau$:


ra (ro -x(0), Jaw < KN*(In N)* [2] 


(Here $\rho_H$ is the Hausdorff distance: for every two
sets $A, C \subset \mathbb{R}^d$, $\rho_H(A, C)$ is defined as $\max\{\inf[r:
A \subset C + B_r], \inf[r: C \subset A + B_r]\}$, where $B_r$ is the
ball of radius r.)
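
For finite point sets the two infima in this definition reduce to max-min distances, which gives a simple way to compute the Hausdorff distance. The sketch below is a brute-force illustration for finite sets of points in the plane or in any dimension; the function names and the sample sets are hypothetical, and a contour would first have to be discretized into points.

```python
import math


def directed_distance(a_pts, c_pts):
    """Smallest r such that every point of a_pts is within distance r of c_pts."""
    return max(min(math.dist(a, c) for c in c_pts) for a in a_pts)


def hausdorff_distance(a_pts, c_pts):
    """Maximum of the two directed distances, as in the definition above."""
    return max(directed_distance(a_pts, c_pts), directed_distance(c_pts, a_pts))


if __name__ == "__main__":
    set_a = [(0.0, 0.0), (1.0, 0.0)]
    set_c = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
    # set_a is contained in set_c, but the extra point of set_c is far from set_a.
    print(directed_distance(set_a, set_c), directed_distance(set_c, set_a))
    print(hausdorff_distance(set_a, set_c))
```

The example shows why both directions are needed: one set can lie entirely inside the other while the distance between them is still large.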


The proof of the above result is the content of 
the book by Dobrushin et al. (1992). In the 
two-dimensional case, it remains true for all
temperatures $\beta^{-1}$ below the critical one (Ioffe and
Schonmann 1998). The value 2/3 of the exponent is
an improvement of the original 3/4 result 
(Alexander 1992). Probably, it can be further 
improved down to 1/2. Though Dobrushin et al. 
(1992) treat only the Ising model, their results are 
valid for a wide range of other models. 

The restriction $\rho > g_d$ in the theorem is needed
because without it the droplet may prefer to assume
the shape of a strip between two meridians rather
than to take the Wulff shape.


Three-Dimensional Case 


In the case d = 3 or d > 3, the statement that a
typical configuration $\sigma$ has only one big contour
$\Gamma(\sigma)$ is still true. But the analog of [2] is not known.
It is natural to conjecture that it holds at low 
temperatures, even in a stronger version, with only a 
logarithmic term K(In N)* in the RHS. It probably 
fails at higher subcritical temperatures. 

What is known to hold is a weaker version of this
theorem, where the distance between the random
droplet and the Wulff shape is measured not in
Hausdorff distance, but in the $L^1$ sense. To state the
corresponding theorem, we will associate with every
configuration $\sigma$ on a lattice torus $T_N^d$ a real-valued
function $M_\sigma(t)$ on the unit torus $T^d = \mathbb{R}^d/\mathbb{Z}^d$,
and we then compare this function with the
indicator function $\mathbb{I}_{sK_\tau}$, where $sK_\tau \subset T^d$ is the Wulff
body, properly scaled.

The function $M_\sigma(t)$ is defined as follows. We
denote by $i_N$ the natural embedding of the discrete
torus $T_N^d$ into $T^d$, the image of $i_N$ being the grid with
spacing 1/N. For $t \in T^d$ we define $b_N(t) \subset T^d$ to be
the ball centered at t with radius $\sqrt{1/N}$, and let
$B_N(t) \subset T_N^d$ be its preimage under $i_N$. Then


$M_\sigma(t) = \frac{1}{|B_N(t)|}\sum_{x \in B_N(t)}\sigma(x)$


We have to expect to see a droplet sK, with 


E d m) —p 
w, 2m*(B) 


Let us introduce for every subset A C T^ the 


indicator 
1, 
la (t) = Li 


For every function v in L'(T?) we denote by U(v, 6) 
its $-neighborhood in L! (T^). 

The result can now be formulated. Suppose the 
temperature 3"! is below the critical one. Then the 
function M,(t) is close to the characteristic function 
of the Wulff shape: For every 6 > 0 


tcA 
t c AS 


! 1 
lim 14? —— M) € | | Ulka) $ = 1 
HN m*(B) ( ) U ( K,+t ) 


The shifts by all t of the Wulff shape $sK_\tau$ appear
in the statement since the location of the droplet can
be arbitrary. Note that if a point t is such that the
ball $B_N(t)$ stays away from the boundary of the
droplet $\Gamma(\sigma)$ present in the configuration $\sigma$, then the
value $M_\sigma(t)$ should be expected to be $\pm m^*(\beta)$,
depending on whether t is outside or inside the
droplet, which explains the factor $1/m^*(\beta)$.

For a proof, see Bodineau (1999) and Cerf and
Pisztora (1999).


See also: Cluster Expansion; Large Deviations in 
Equilibrium Statistical Mechanics; Metastable States; 
Percolation Theory; Statistical Mechanics of Interfaces. 
Introduction


The term Yang-Baxter equations (YBEs) was coined 
by Faddeev in the late 1970s to denote a principle of 
integrability, that is, exact solvability, in a wide 
variety of fields in physics and mathematics. Since 
then it has become a common name for several 
classes of local equivalence transformations in 
statistical mechanics, quantum field theory, differ- 
ential equations, knot theory, quantum groups, and 
other disciplines. We shall cover the various versions 
and their relationships, paying attention also to the 
early historical development. 


Electric Networks 


The first such transformation came up as early as 
1899 when the Brooklyn engineer Kennelly pub- 
lished a short paper, entitled “The equivalence of 
triangles and three-pointed stars in conducting net- 
works.” This work gave the definite answer to such 
questions as whether it is better to have the three 
coils in a dynamo - or three resistors in a network — 
arranged as a star or as a triangle, see Figure 1. 
Using Kirchhoff's laws, the two situations in Figure 1 
can be shown to be equivalent provided 


$Z_1\bar{Z}_1 = Z_2\bar{Z}_2 = Z_3\bar{Z}_3$
$= Z_1Z_2 + Z_2Z_3 + Z_3Z_1$ [1]
$= \bar{Z}_1\bar{Z}_2\bar{Z}_3/(\bar{Z}_1 + \bar{Z}_2 + \bar{Z}_3)$ [2]


Here one has to take either [1] or [2] as second line
of the equation, depending on which direction the
transformation is to go. The star-triangle transformation
thus defined is also known under other
names within the electric network theory literature
as wye-delta (Y-Δ), upsilon-delta (Υ-Δ), or
tau-pi (T-Π) transformation.
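
The two directions of the transformation can be sketched in a few lines. The code below assumes the convention that the star impedance $Z_i$ and the triangle impedance $\bar{Z}_i$ carrying the same index satisfy $Z_i\bar{Z}_i$ = common value, with that value given by [1] when going from star to triangle and by [2] when going back; the function names and sample values are hypothetical.

```python
def star_to_triangle(z1, z2, z3):
    """Triangle impedances from star impedances, using [1]: Z_i * Zbar_i = Z1 Z2 + Z2 Z3 + Z3 Z1."""
    common = z1 * z2 + z2 * z3 + z3 * z1
    return common / z1, common / z2, common / z3


def triangle_to_star(zb1, zb2, zb3):
    """Star impedances from triangle impedances, using [2]: Z_i * Zbar_i = Zb1 Zb2 Zb3 / (Zb1 + Zb2 + Zb3)."""
    common = zb1 * zb2 * zb3 / (zb1 + zb2 + zb3)
    return common / zb1, common / zb2, common / zb3


if __name__ == "__main__":
    star = (1.0, 2.0, 3.0)
    triangle = star_to_triangle(*star)
    print(triangle)
    print(triangle_to_star(*triangle))  # should recover the star values
    # Complex impedances (e.g. coils at a fixed frequency) work the same way.
    print(star_to_triangle(1 + 2j, 3 - 1j, 0.5j))
```

Applying one map after the other returns the original values, which is the sense in which [1] and [2] describe the same equivalence read in opposite directions.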


Spin Models 


When Onsager wrote his monumental paper on the 
Ising model published in 1944, he made a brief 
remark on an obvious star-triangle transformation 
relating the model on the honeycomb lattice with 
the one on the triangular lattice. His details on this 
were first presented in Wannier's review article of 
1945. However, the star-triangle transformation 
played a much more crucial role in Onsager's 
reasoning, as it is also intimately connected with 
his elliptic function uniformizing parametrization. 

Furthermore, it implies the commutation of 
transfer matrices and spin-chain Hamiltonians. 
Only in his Battelle lecture of 1970 did Onsager 
explain how he used this remarkable observation in 
his derivation of the formula for the spontaneous 
magnetization which he had announced as a 
conference remark in 1948 and of which the first 
complete derivation had been published by Yang in 
1952 using a completely different method. 

Many other applications and generalizations have
since appeared. Most generally, we can consider a
system whose state variables, also called spins, take
values from some suitable discrete or continuous sets.
The interactions between spins a and b are given in
terms of weight factors $W_{ab}$ and $\overline{W}_{ab}$, which are
complex numbers in general, see Figure 2. One
quantity of special interest is the partition function, the
sum of the product of all weight factors over all
allowed spin values. The integrability of the model is
expressed by the existence of spectral variables
(rapidities p, q, r, ...) that live on oriented lines, two
of which cross between a and b as indicated by the
dotted lines in Figure 2. Arrows from a to b are added
to keep track of the ordering of a and b in case the
weights are chiral (not symmetric).

In Onsager's special choice of the Ising model the
spins take values a, b, c, ... = ±1 and the weight
factors are the usual real positive Boltzmann weights
depending on the product ab = ±1, uniformizing
variable p − q, and elliptic modulus k. In the integrable
chiral Potts model the weights depend on a − b
mod N, with a, b = 1, ..., N, whereas the rapidities p
and q live in general on a higher-genus curve.

Figure 1 Star-triangle equation for impedances.

Figure 2 Spin model weights $W_{ab}(p,q)$ and $\overline{W}_{ab}(p,q)$.


When the weights are asymmetric in the spins, there 
are two sets of star-triangle equations which can be 
expressed both pictorially (Figure 3) and algebraically: 


EY Walp, q4)W alq, r) Waa (p, r) 
d 
= R(p,9, 1) Wal, 9) W.a(q,r) W.(p, 1) [3] 
R(p, d, r) Wap (p, q) Wac(9, r) Woe (p, r) 
=> Wac(b, q) W;a(q,r) Wap, r) [4] 
d 


Note that eqns [3] and [4] differ from each other by the
transposition of both spin variables in all six weight
factors. In general, there may also appear scalar factors
$R(p,q,r)$ and $\overline{R}(p,q,r)$, which can often be eliminated
by a suitable renormalization of the weights. If a, b,
and c take values in the same set, we can sum over
a = b = c, showing that $R = \overline{R}$ in that case.

Figure 3 Star-triangle equation.

The Kennelly star-triangle equation [1], [2] can be 
recovered as a special limit of a spin model where 
the states are continuous variables. 


Knot Theory and Braid Group 


A seemingly totally different situation occurs in the
theory of knots, links, tangles, and braids. In 1926,
Reidemeister showed that only three types of moves
suffice to show the equivalence between two
different configurations, see Figure 4. Moves of
type I (removing simple loops) do not apply to
braids. Moves of type II, for which one strand crosses
twice over another strand, can be reformulated for
braids, namely that an overcrossing is the inverse of
an undercrossing. The Reidemeister move of type III
is a precursor of the more general Yang-Baxter
moves and can be represented also by the defining
relations of Artin's braid group. Let $R_{j,j+1}$ be the
operator representing the situation in which the
strand in position j crosses over the one in position
j+1. Then a braid can be represented by a product
of $R_{j,j+1}$'s and their inverses, provided


$R_{j,j+1}R_{j+1,j+2}R_{j,j+1} = R_{j+1,j+2}R_{j,j+1}R_{j+1,j+2}$ [5]

and

$[R_{i,i+1}, R_{j,j+1}] = 0$, if $|i - j| \ge 2$ [6]

and similar relations in which $R_{j,j+1}$ and/or $R_{j+1,j+2}$
are replaced by their inverses.


Factorizable S-Matrices and Bethe Ansatz 


In the early 1960s, Lieb and Liniger solved the one-dimensional
Bose gas with delta-function interaction
using the Bethe ansatz. Yang and McGuire then tried
to generalize this result to systems with internal
degrees of freedom and to fermions. This led to the
discovery of the condition for factorizable S-matrices
by McGuire in 1964, represented pictorially by
Figure 5, where the world lines of the particles are
given. Upon collisions the particles can only exchange
their rapidities p, q, r, so that there is no dispersion.
Also indicated are the internal degrees of freedom in
Greek letters. In other words, the three-body S-matrix
can be factorized in terms of two-body contributions
and the order of the collisions does not affect the
final outcome. McGuire also realized that this
condition is all one needs for the consistency of
factoring the m-body S-matrix in terms of two-body
S-matrices. The consistency condition is obviously
related to the Reidemeister move of type III in
Figure 4.

Figure 4 Reidemeister moves of types I, II, and III.

Figure 5 Vertex model YBE.

Yang succeeded in solving the spin-1/2 fermionic
model using a nested Bethe ansatz, utilizing a
generalization of Artin's braid relations [5] and [6],


$R_{j,j+1}(p-q)\,R_{j+1,j+2}(p-r)\,R_{j,j+1}(q-r) = R_{j+1,j+2}(q-r)\,R_{j,j+1}(p-r)\,R_{j+1,j+2}(p-q)$ [7]


He submitted his findings in two short papers in
1967. The R operators in eqn [7] (a notation
introduced later by the Leningrad school) depend
on differences of two momenta or two relativistic
rapidities. Sutherland solved the general spin case
using repeated nested Bethe ansätze, while Lieb and
Wu used Yang's work to solve the one-dimensional
Hubbard model.
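
A relation of the form [7] can be checked numerically for a simple example. The sketch below assumes the rational operator $R(u) = c\,\mathbf{1} + u\,P$ acting on two neighboring two-state sites, where P exchanges the two states and c is an illustrative coupling constant; this choice, the three-site setting, and all function names are assumptions made for illustration rather than a reproduction of Yang's construction. The matrices are small, so plain nested lists are used.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    """Kronecker product: rows indexed by (i, k), columns by (j, l)."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def swap():
    """Permutation operator P |s1 s2> = |s2 s1> on two two-state sites."""
    p = [[0.0] * 4 for _ in range(4)]
    for s1 in range(2):
        for s2 in range(2):
            p[2 * s2 + s1][2 * s1 + s2] = 1.0
    return p


def r_op(u, c):
    """Assumed rational operator R(u) = c * 1 + u * P on neighboring sites."""
    p, one = swap(), identity(4)
    return [[c * one[i][j] + u * p[i][j] for j in range(4)] for i in range(4)]


def on_sites_12(m):
    return kron(m, identity(2))


def on_sites_23(m):
    return kron(identity(2), m)


def braid_residual(p, q, r, c=1.0):
    """Largest entry of LHS - RHS of the braid-type relation with spectral parameters."""
    lhs = matmul(matmul(on_sites_12(r_op(p - q, c)), on_sites_23(r_op(p - r, c))),
                 on_sites_12(r_op(q - r, c)))
    rhs = matmul(matmul(on_sites_23(r_op(q - r, c)), on_sites_12(r_op(p - r, c))),
                 on_sites_23(r_op(p - q, c)))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(8) for j in range(8))


if __name__ == "__main__":
    print(braid_residual(0.3, -1.1, 2.4, c=0.7))
```

The residual is zero up to rounding for any p, q, r because the terms linear in c cancel only when the middle argument is the sum of the outer two, here (p − q) + (q − r) = p − r.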


Vertex Models 


Since Lieb's solution of the ice model by a Bethe
ansatz, there have been many developments on
vertex models, in which the state variables live on
line segments and weight factors $\omega_{\alpha\beta}^{\lambda\mu}$ are assigned to
a vertex where four line segments with the four
states $\alpha$, $\mu$, $\lambda$, $\beta$ on them meet, see Figure 6.

Baxter solved the eight-vertex model in 1971, using 
a method based on commuting transfer matrices, 
starting from a solution of what he then called the 
generalized star-triangle equation, but what is now 
commonly called the Yang-Baxter equation (YBE): 
Figure 6 Vertex model weight, IRF model weight, and mixed model weight, each depending on the rapidities (p, q).


> 3 ufa E (p, q (q, ru Br ? (p, r) 
p 3 Sip qui" (qr) (pr) [8] 


( y" p" y dd 


This equation is represented graphically in Figure 5.
From it one can also derive a sufficient condition for
the commutation of transfer matrices and spin-chain
Hamiltonians, generalizing the work of McCoy and
Wu, who had earlier initiated the search by showing
that the general six-vertex model transfer matrix
commutes with a Heisenberg spin-chain Hamiltonian.
To be more precise, Baxter found that if
the vertex weights reduce to a product of two Kronecker
deltas for some choice of p and q, some spin-chain
Hamiltonians could be derived as logarithmic
derivatives of the transfer matrix.


Interaction-Round-a-Face Model 


Baxter introduced another language, namely that of the
IRF or “interaction-round-a-face” model, in connection
with his solution of the hard-hexagon
model. This formulation is convenient when
studying one-point functions using the corner-transfer-matrix
method. Now the integrability condition can be
represented graphically as in Figure 7 or algebraically as


» p. qwe (a, rw b.) 


= 0, (p,q) we, (q,r) w^ (p, r) [9] 


The spins live on faces enclosed by rapidity lines and
the IRF weights are assigned as in Figure 6.


Figure 7 IRF model YBE.
Baxter discovered a new principle based on eqns [8]
and [9], which he called Z-invariance, as it expresses 
an invariance of the partition function Z under moves 
of rapidity lines. This also implies that typical one- 
point functions are independent of the values of the 
rapidities, while two-point functions can only depend 
on the values of the rapidities of rapidity lines crossing 
between the two spins considered. Many recent results 
on correlation functions in integrable models depend 
on this observation of Baxter. 


IRF-Vertex Model 


In Figure 6, we have also defined mixed IRF-vertex
model weights. (We could put further
state variables on the vertices, but then the natural
thing to do is to introduce new effective weights
summing over the states at each vertex.) With the
choice made, a more general YBE can be represented
as in Figure 8, or by
nan WA 
2,22, 2 Wi le (P,a) 
o" B aft 
"B' de 
X Wi, ny" PCR, Ws; le 4 (P, r) 
YYY o 
nl 
a" pr a 
"l ic IAN a'b 
x Wir lalar) Wi» leg (0,7) 10] 


ay By" 


Quantum Inverse-Scattering Method 


The Leningrad school of Faddeev incorporated the
methods of Baxter and Yang in their so-called
quantum inverse-scattering method (QISM), coining
the term quantum YBEs (QYBEs) for eqns [8]. If
special limiting values of p and q can be found, say as
$\hbar \to 0$, such that the weights reduce to a product of
Kronecker deltas plus corrections of order $\hbar$, one can reduce
[8] to the classical Yang-Baxter equations (CYBEs) by
expanding up to the first nontrivial order in the expansion
variable $\hbar$. These determine the integrability of certain
models of classical mechanics by the inverse-scattering
method and the existence of Lax pairs.


Figure 8 General YBE.


Figure 9 Checkerboard versions of the weights.


Checkerboard Generalizations


Star-triangle equations [3] and [4] imply that there are
further generalizations of the YBEs, namely those for
which the faces enclosed by the rapidity lines are
alternatingly colored black and white in a checkerboard
pattern. We can then introduce either vertex model
weights, or IRF-vertex model weights, or IRF
model weights, each in two versions, see Figure 9.
The black faces are those where the spins of the
spin model with weights defined in Figure 2 live; the
white faces are to be considered empty in Figures 2
and 3 (or, equivalently, they can be assumed to host
trivial spins that take on only a single value).
Clearly, the IRF-vertex model description contains
all the other versions.


Checkerboard Vertex Model 


First we consider the checkerboard vertex model 
with weights wa (D, q ) and wa" (p,q) as assigned in 
Figure 9. The YBE [8] then generalizes to two sets of 
equations: 


39309307 MC) 


wil (a, ry (p, r) 


o" p" y" 
—a’ B' 
= R(p.q.r) > Dd > Daw (Pa) 
c" p" y" 
x DY" (q,r)u (p, 1) [11] 
R R(p, q,r) ")* Y V us (p. Dan (q,r r)w} us P (p. r) 
o" p" ey! 
o! 3! "e" Am p" 
-M 9 (pq) ze (qr (p) [12 
o" p" a" 


where scalar factors $R$ and $\overline{R}$ have been added as in
[3] and [4]. These equations are represented graphically
by Figure 10.


Figure 10 Checkerboard vertex model YBE. 
Checkerboard IRF Model


The checkerboard IRF version of the YBE [8] 
becomes 


Y wj b. q) (q, rw, p.) 
d 


= R(p.q.r) » wap, wa (q.r)wes(p,r) [13] 
d' 


R(p, q,r P» p, qw (4, rw b, r) 


B 72 Mis (p,q) wiv (q, r wep (D. r) [14] 


again with scalar factors $R$ and $\overline{R}$ added as in [3]
and [4]. These equations can now be represented
graphically as in Figure 11. Note that these 
equations reduce to eqns [3] and [4] if the spins on 
the white faces are allowed to take only one value, 
which means that they can be ignored. 


Checkerboard IRF-Vertex Model 


Finally, the most general case is represented by the 
checkerboard IRF-vertex model, with weights 
defined in Figure 9. For this case the YBEs are 
given by 


ioi 2 Waa lev (Ps d) 
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with its graphical representation in Figure 12. 
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Figure 12 Checkerboard YBE. 


Formal Equivalence of Languages 
The Square Weight 


Combining four weights of a checkerboard model in 
a square, as is done with four spin model weights 
in Figure 13, we find a regular vertex model weight 
with rapidities that are now pairs of the original 
ones. This gives 


Wop (P1 qi )W a(pi , q2)W apa, qi )Wha(pa. q2) 
= wX (P1; 225 91; 92) [17] 




Figure 13 Square weight as vertex weight. 


From any solution of [3] and [4] we can thus 
construct a solution of YBE [8]. This has been used 
by Bazhanov and Stroganov to relate the integrable 
chiral Potts model with a cyclic representation of the 
six-vertex model. 


Map to Checkerboard Vertex Model 


The checkerboard IRF-vertex model formulation 
contains all other versions mentioned above as 
special cases. However, collecting the state variables 
in triples, we can immediately translate it to a vertex 
model version, writing 


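$$\omega^{\hat\lambda\hat\beta}_{\hat\alpha\hat\mu}(p,q) = w^{\lambda\beta}_{\alpha\mu}\big|^{dc}_{ab}(p,q), \qquad \overline{\omega}^{\hat\lambda\hat\beta}_{\hat\alpha\hat\mu}(p,q) = \overline{w}^{\lambda\beta}_{\alpha\mu}\big|^{dc}_{ab}(p,q),$$
$$\text{if}\quad \hat\lambda = (d,\lambda,c),\quad \hat\beta = (b,\beta,c),\quad \hat\alpha = (a,\alpha,d),\quad \hat\mu = (a,\mu,b)$$ [18]


$$\omega^{\hat\lambda\hat\beta}_{\hat\alpha\hat\mu}(p,q) = \overline{\omega}^{\hat\lambda\hat\beta}_{\hat\alpha\hat\mu}(p,q) = 0 \quad \text{otherwise}$$ [19]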


In eqn [19], we have set to zero all vertex model weights 
that are inconsistent with IRF-vertex configurations. 
Clearly, the translation of IRF models and 
spin models to vertex models can be done similarly. 
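
The bookkeeping behind eqns [18] and [19] can be sketched in a few lines of Python. This is only an illustration: the layout of the triples, (d, λ, c), (b, β, c), (a, α, d), (a, μ, b), is an assumed reading of eqn [18], and `irf_vertex_weight` is a hypothetical placeholder, not a weight taken from the article.

```python
# Sketch only: `irf_vertex_weight` is a made-up placeholder weight, and the
# triple layout below is an assumed reading of eqn [18].

def irf_vertex_weight(lam, beta, alpha, mu, a, b, c, d):
    """Hypothetical IRF-vertex weight; any function of these labels would do."""
    return 1.0 + 0.1 * (lam + beta + alpha + mu) + 0.01 * (a + b + c + d)


def vertex_weight(lam_hat, beta_hat, alpha_hat, mu_hat, w=irf_vertex_weight):
    """Vertex weight on composite states of the form (face, line state, face)."""
    d1, lam, c1 = lam_hat      # lam_hat   = (d, lambda, c)
    b1, beta, c2 = beta_hat    # beta_hat  = (b, beta, c)
    a1, alpha, d2 = alpha_hat  # alpha_hat = (a, alpha, d)
    a2, mu, b2 = mu_hat        # mu_hat    = (a, mu, b)
    # The four triples must agree on the shared face variables a, b, c, d.
    if (a1, b1, c1, d1) != (a2, b2, c2, d2):
        return 0.0             # inconsistent configuration, as in eqn [19]
    return w(lam, beta, alpha, mu, a1, b1, c1, d1)


a, b, c, d = 0, 1, 2, 1
consistent = vertex_weight((d, 1, c), (b, 0, c), (a, 1, d), (a, 0, b))
inconsistent = vertex_weight((d, 1, c), (b, 0, c), (a, 1, d), (a + 1, 0, b))
print(consistent, inconsistent)
```

Most composite configurations are inconsistent, so the resulting vertex model has many vanishing weights; the translation is formal rather than economical.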


Map to Spin Model 


We can, furthermore, translate each vertex model 
with weights assigned as in Figures 6 or 9 into a spin 
model with weights as in Figure 2 by defining 
suitable spins in the black faces, after checkerboard 
coloring. Each spin is then defined to be the ordered 
set of states on the line segments of the vertex 
model, a= (o1,02,...), ordering the line segments 
counterclockwise starting at, say, 12 o'clock. We 
can then identify wy (p, q) = Wa, (p,q), Da, (p, q) = 
W..»(p,q). This is surely not very economical, as 
many of the weights will be equal, but it helps show 
that all different versions of the checkerboard YBE 
are formally equivalent. 

Hence, we shall only use the vertex model 
language in the following. It is fairly straightforward 
to convert to the other formulations. 


An sl(m|n) Example 


One fundamental example is a $Q$-state model for 
which the rapidities have $2Q+1$ components, 
$p = (p_{-Q}, \ldots, p_{Q})$, $q = (q_{-Q}, \ldots, q_{Q})$, etc., and the 
states on the line segments are arranged in strings 
of continuing conserved color. The vertex weights, 
for $\alpha, \beta, \lambda, \mu = 1, \ldots, Q$, are given by 


uM (p,q) = wm (Do, do aa w po 
d+a —H 
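with ($\rho \neq \sigma$) 

$$\omega^{\rho\rho}_{\rho\rho}(p_0,q_0) = N \sinh[\eta + \varepsilon_\rho (p_0 - q_0)],$$
$$\omega^{\sigma\rho}_{\rho\sigma}(p_0,q_0) = N G_{\rho\sigma} \sinh(p_0 - q_0),$$
$$\omega^{\rho\sigma}_{\rho\sigma}(p_0,q_0) = N e^{(p_0 - q_0)\,\mathrm{sign}(\rho - \sigma)} \sinh\eta,$$
$$\omega^{\lambda\beta}_{\alpha\mu}(p_0,q_0) = 0, \quad \text{otherwise}$$ [21]
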
where $N$ is an arbitrary overall normalization factor 
and $\eta$ is a constant. Furthermore, $\varepsilon_\rho = \pm 1$ for 
$\rho = 1, \ldots, Q$, where $m$ of them equal $+1$ and $n$ of 
them equal $-1$. The $G_{\rho\sigma}$'s are constants satisfying 
$G_{\rho\sigma} = 1/G_{\sigma\rho}$, which freedom is allowed because the 
number of $\rho$-$\sigma$ crossings minus the number of $\sigma$-$\rho$ 
crossings is fixed by the states on the boundary only, 
that is, the choice of $\alpha, \alpha', \beta, \beta', \gamma, \gamma'$ in YBE [8] and 
Figure 5. 

The solution [20], [21] has many applications. 
The case $m = 0$, $n = 2$ leads to the general six-vertex 
model; the case $m = 0$ with general $n$ produces the fundamental 
intertwiner of the affine quantum group $U_q\widehat{sl}(n)$, 
whereas the case $m = 2$, $n = 1$ corresponds to the 
supersymmetric one-dimensional $t$-$J$ model. 


Operator Formulations 
The R-Matrix 


For a problem with $N$ rapidity lines, carrying 
rapidities $p_1, \ldots, p_N$, we can introduce a set of 
matrices $R_{ij}(p_i, p_j)$, for $1 \le i < j \le N$, with elements 


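$$R_{ij}(p_i,p_j)^{\beta_1 \cdots \beta_N}_{\alpha_1 \cdots \alpha_N} = \omega^{\beta_j \beta_i}_{\alpha_i \alpha_j}(p_i,p_j) \prod_{k \neq i,j} \delta^{\beta_k}_{\alpha_k}$$ [22]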


In terms of these, the YBE [8] can be rewritten in 
matrix form as 


$$R_{jk}(p_j,p_k)\, R_{ik}(p_i,p_k)\, R_{ij}(p_i,p_j) = R_{ij}(p_i,p_j)\, R_{ik}(p_i,p_k)\, R_{jk}(p_j,p_k)$$ [23]


where $1 \le i < j < k \le N$. 
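
A matrix identity of the form [23] is easy to test numerically. The sketch below does so for a standard textbook solution that is an assumption here and not one of the article's examples: Yang's rational R-matrix $R_{ij}(p_i,p_j) = (p_i - p_j)\,\mathbf{1} + P_{ij}$, with $P_{ij}$ the permutation of the states on lines $i$ and $j$. The helper names (`embed`, `ybe_residual`) are illustrative only.

```python
from itertools import product

Q = 2  # number of states per line (assumed for the demo)


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def embed(u, i, j, nsites=3, nstates=Q):
    """Matrix of R_ij(u) = u*1 + P_ij acting on nsites lines."""
    states = list(product(range(nstates), repeat=nsites))
    index = {s: n for n, s in enumerate(states)}
    dim = len(states)
    M = [[0.0] * dim for _ in range(dim)]
    for s in states:
        col = index[s]
        M[col][col] += u              # u times the identity
        t = list(s)
        t[i], t[j] = t[j], t[i]       # P_ij swaps the states on lines i and j
        M[index[tuple(t)]][col] += 1.0
    return M


def max_abs_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))


def ybe_residual(p1, p2, p3):
    """Largest entry of R23 R13 R12 - R12 R13 R23, the two sides of [23]."""
    R12 = embed(p1 - p2, 0, 1)
    R13 = embed(p1 - p3, 0, 2)
    R23 = embed(p2 - p3, 1, 2)
    lhs = matmul(R23, matmul(R13, R12))
    rhs = matmul(R12, matmul(R13, R23))
    return max_abs_diff(lhs, rhs)


print(ybe_residual(0.3, -1.1, 2.0))
```

The residual vanishes up to rounding for any choice of the three rapidities, because this R-matrix depends only on rapidity differences.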
The Ř-Matrix 


If we transpose the $\beta$ indices $\beta_i$ and $\beta_j$ in eqn [22], 
we can define a set of matrices $\check{R}_{i,i+1}(p,q)$ with 
elements 


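$$\check{R}_{i,i+1}(p,q)^{\beta_1 \cdots \beta_N}_{\alpha_1 \cdots \alpha_N} = \omega^{\beta_i \beta_{i+1}}_{\alpha_i \alpha_{i+1}}(p,q) \prod_{k \neq i,i+1} \delta^{\beta_k}_{\alpha_k}$$ [24]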


Using these, the YBE [8] can be rewritten in matrix 
form as 


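$$\check{R}_{i,i+1}(q,r)\, \check{R}_{i+1,i+2}(p,r)\, \check{R}_{i,i+1}(p,q) = \check{R}_{i+1,i+2}(p,q)\, \check{R}_{i,i+1}(p,r)\, \check{R}_{i+1,i+2}(q,r)$$ [25]


and 


$$[\check{R}_{i,i+1}(p,q),\, \check{R}_{j,j+1}(r,s)] = 0, \quad \text{if } |i-j| \ge 2$$ [26]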


In this formulation, it is clear that many solutions 
can be found by “Baxterizing” Temperley-Lieb and 
Iwahori-Hecke algebras. 
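
As a minimal sketch of what Baxterizing means, the following Python check builds a two-state representation of Temperley-Lieb generators $e_i$ with $e_i^2 = 2\cosh\eta\; e_i$ and $e_1 e_2 e_1 = e_1$, and tests numerically that $\check{R}_{i,i+1}(p,q) = \sinh(\eta - p + q)\,\mathbf{1} + \sinh(p - q)\, e_i$ satisfies eqn [25]. Both the representation and this particular parametrization are assumptions made for the illustration; they are not taken from the article.

```python
import math


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in A for row_b in B]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def max_abs_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))


def tl_generators(eta):
    """e_1 and e_2 on three two-state sites, with e_i^2 = 2*cosh(eta)*e_i."""
    t = math.exp(eta)
    e = [[0.0, 0.0, 0.0, 0.0],
         [0.0, t, 1.0, 0.0],
         [0.0, 1.0, 1.0 / t, 0.0],
         [0.0, 0.0, 0.0, 0.0]]
    return kron(e, identity(2)), kron(identity(2), e)


def r_check(u, eta, site):
    """Assumed Baxterized form: sinh(eta - u)*1 + sinh(u)*e_site."""
    e = tl_generators(eta)[site]
    one = identity(8)
    a, b = math.sinh(eta - u), math.sinh(u)
    return [[a * one[i][j] + b * e[i][j] for j in range(8)] for i in range(8)]


def braid_ybe_residual(p, q, r, eta=0.7):
    """Largest entry of the difference of the two sides of eqn [25]."""
    lhs = matmul(r_check(q - r, eta, 0),
                 matmul(r_check(p - r, eta, 1), r_check(p - q, eta, 0)))
    rhs = matmul(r_check(p - q, eta, 1),
                 matmul(r_check(p - r, eta, 0), r_check(q - r, eta, 1)))
    return max_abs_diff(lhs, rhs)


print(braid_ybe_residual(0.9, 0.4, -0.3))
```

Only the two algebraic relations of the generators enter the verification, so any other representation of the same algebra would pass the same test.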


Classical YBEs 
If we expand 
$$R_{ij}(p_i,p_j) = \mathbf{1} + \hbar X_{ij}(p_i,p_j) + O(\hbar^2)$$ [27]


in [23], we get in second order in $\hbar$ the classical YBE 
(CYBE) as the vanishing of a sum of three commutators, 
that is, 


$$[X_{ij}(p_i,p_j),\, X_{ik}(p_i,p_k)] + [X_{ij}(p_i,p_j),\, X_{jk}(p_j,p_k)] + [X_{ik}(p_i,p_k),\, X_{jk}(p_j,p_k)] = 0$$ [28]


introduced by Belavin and Drinfel'd, where $X_{ij}$ is 
called the classical r-matrix. 
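
The three-commutator condition [28] can also be checked numerically. The sketch below assumes the simplest rational classical r-matrix, $X_{ij}(p_i,p_j) = P_{ij}/(p_i - p_j)$ with $P_{ij}$ the permutation operator; this choice and the helper names are illustrative assumptions, not part of the article.

```python
from itertools import product


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def perm_matrix(i, j, nsites=3, nstates=2):
    """Permutation operator P_ij on nsites lines with nstates states each."""
    states = list(product(range(nstates), repeat=nsites))
    index = {s: n for n, s in enumerate(states)}
    P = [[0.0] * len(states) for _ in states]
    for s in states:
        t = list(s)
        t[i], t[j] = t[j], t[i]
        P[index[tuple(t)]][index[s]] = 1.0
    return P


def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(AB, BA)]


def x_matrix(i, j, p):
    """Assumed classical r-matrix X_ij = P_ij / (p_i - p_j)."""
    scale = 1.0 / (p[i] - p[j])
    return [[scale * v for v in row] for row in perm_matrix(i, j)]


def cybe_residual(p):
    """Largest entry of the sum of the three commutators in eqn [28]."""
    X12, X13, X23 = x_matrix(0, 1, p), x_matrix(0, 2, p), x_matrix(1, 2, p)
    terms = [commutator(X12, X13), commutator(X12, X23), commutator(X13, X23)]
    dim = len(X12)
    return max(abs(sum(t[r][c] for t in terms))
               for r in range(dim) for c in range(dim))


print(cybe_residual([0.3, -1.1, 2.0]))
```

Each commutator is nonzero on its own; only their sum cancels, which is the content of the CYBE.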


Reflection YBEs 


Cherednik and Sklyanin found a condition deter- 
mining the solvability of systems with boundaries, 
the reflection YBEs (RYBEs), see Figure 14. Upon 


Figure 14 Reflection YBE. 
collisions with a left or right wall the rapidity 
variable changes from $p$ to $\bar{p}$ and back. In most 
examples, in which the rapidities are difference 
variables such that R(p, q) =R(p — q), one also has 
p= — p, with y some constant. The corresponding 
left boundary weights are K*(p, p) satisfying 


Ki (q. q) Ri (p, q)i (p, p) Ria (q, p) 
= Rio (p,q) Ki (p, P)Ri2(q, p) Ka (q.q) [29] 


with K;(p,p) defined by a direct product as in [24] 
appending unit matrices for positions 722, and a 
similar equation must hold for the right boundary. 
Most work has been done for vertex models, while 
Pearce and co-workers wrote several papers on the 
IRF-model version. 


Higher-Dimensional Generalizations 


In 1980 Zamolodchikov introduced a three- 
dimensional generalization of the YBE, the so-called 
tetrahedron equations (TEs), and he found a special 
solution. Baxter then succeeded in proving that 
this solution satisfies all TEs. Baxter and Bazhanov 
showed in 1992 that this solution can be seen as 
a special case of the $sl(\infty)$ chiral Potts model. 
Several authors found further generalizations more 
recently. 


Inversion Relations 


When $\omega^{\lambda\beta}_{\alpha\mu}(p,p) \propto \delta^{\lambda}_{\alpha}\,\delta^{\beta}_{\mu}$, that is, the weight decouples 
when the two rapidities are equal, one can derive the 
local inverse relation depicted in Figure 15, which is 
a generalization of the Reidemeister move of type II 
in Figure 4. It is easily shown that C(q, p) = C(p, q). 
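
In operator language, a local inversion relation of this kind takes the form $\check{R}(p,q)\,\check{R}(q,p) = C(p,q)\,\mathbf{1}$. A minimal numerical sketch, assuming the rational example $\check{R}(p,q) = \mathbf{1} + (p-q)\,P$ with $P$ the permutation of two $Q$-state lines (an illustrative choice, not one of the article's models), gives $C(p,q) = 1 - (p-q)^2$, which is indeed symmetric in $p$ and $q$:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def swap_matrix(nstates):
    """Permutation operator P on two lines with nstates states each."""
    dim = nstates * nstates
    P = [[0.0] * dim for _ in range(dim)]
    for a in range(nstates):
        for b in range(nstates):
            P[b * nstates + a][a * nstates + b] = 1.0
    return P


def r_check(u, nstates=3):
    """Assumed rational example: R-check(u) = 1 + u*P, with u = p - q."""
    P = swap_matrix(nstates)
    dim = nstates * nstates
    return [[(1.0 if i == j else 0.0) + u * P[i][j] for j in range(dim)]
            for i in range(dim)]


def inversion_residual(u, nstates=3):
    """Largest entry of R-check(u) R-check(-u) - (1 - u^2) * 1."""
    prod = matmul(r_check(u, nstates), r_check(-u, nstates))
    c = 1.0 - u * u
    dim = nstates * nstates
    return max(abs(prod[i][j] - (c if i == j else 0.0))
               for i in range(dim) for j in range(dim))


print(inversion_residual(0.37))
```

At equal rapidities, `r_check(0.0)` is the identity, which is the decoupling condition stated above for this example.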

This local relation implies also a global inversion 
relation which can be found in many ways. The 
following heuristic way is the easiest: consider the 
situation in Figure 16, with N closed p-rapidity lines 
and M closed q-rapidity lines. For M and N large, 
we may expect the partition function of Figure 16 
to factor asymptotically in top- and bottom-half 
contributions. If each line segment carries a state 


Figure 15 Local inversion relation. 




Figure 16  Heuristic derivation of inversion relation. 


variable that can assume $Q$ values, then the total 
partition function factors by repeated application of 
the relation in Figure 15 into the contribution of 
M + N circles. Therefore, 


$$Z = Q^{M+N}\, C(p,q)^{MN} \approx Z_{M,N}(p,q)\, Z_{N,M}(q,p)$$ [30]


Taking the thermodynamic limit, 


$$z(p,q) = \lim_{M,N \to \infty} Z_{M,N}(p,q)^{1/MN}$$ [31]

one finds 

$$z(p,q)\, z(q,p) = C(q,p)$$ [32]


In many models, eqn [32], supplemented with some 
suitable symmetry and analyticity conditions, can be 
used to calculate the free energy per site. 


See also: Affine Quantum Groups; Bethe Ansatz; 
Classical r-matrices, Lie Bialgebras, and Poisson Lie 
Groups; Eight Vertex and Hard Hexagon Models; Hopf 
Algebras and q-Deformation Quantum Groups; 
Integrability and Quantum Field Theory; Integrable 
Discrete Systems; Integrable Systems: Overview; The 
Jones Polynomial; Knot Invariants and Quantum Gravity; 
Knot Theory and Physics; Sine-Gordon Equation; 
Topological Knot Theory and Macroscopic Physics; 
Two-Dimensional Ising Model; von Neumann Algebras: 
Subfactor Theory. 
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